(57) Abstract

Methods for detecting the presence of selected mutations, such as the Thr-601 mutation and the Phe-355 mutation, in the 
plasminogen of a patient are disclosed. The methods include exposing amplified genomic DNA to a restriction endonuclease 
capable of differentially cleaving mutant and wild-type plasminogen DNA sequences, and analyzing the exposed DNA to detect 
the presence or absence of cleavage fragments diagnostic for the selected mutation. Diagnostic kits for the rapid detection of the 
selected mutation are also disclosed.

METHOD FOR DETECTING ABNORMAL GENES

Technical Field 

The present invention relates generally to the detection of
abnormal genes. More specifically, the invention provides
methods for detecting the presence of abnormal plasminogen
genes, such as a gene encoding Thr-601 plasminogen or a gene
encoding Phe-355 plasminogen.

Background of the Invention 

In order to understand the mechanisms and genetics of human
diseases, it is important to identify DNA and protein markers
that indicate the presence of genetic defects in populations
and families. For example, deficiencies in protein C,
protein S, antithrombin III, heparin co-factor II, tissue-type
plasminogen activator and plasminogen have been identified as
the cause or at least part of the cause of a predisposition
for thrombosis in some patients with hereditary thrombophilia
(for review, see Bauer and Rosenberg, Blood 70:343-350, 1987,
and Mannucci and Tripodi, Thromb. Haemostas. 57:247-251, 1987).

Plasminogen is a single-chain proenzyme that is converted to
an active two-chain form (consisting of an A and a B chain
connected by two disulfide bonds), called plasmin, by
activators such as tissue-type plasminogen activator,
urokinase, and streptokinase. Plasmin digests fibrin clots to
form soluble fibrin degradation products. In addition, plasmin
is thought to play an important role in various biological
processes, such as inflammation, tissue development and
remodeling, and the processing of other molecules.

The primary structure of plasminogen (790 amino acid residues)
was established by Sottrup-Jensen et al. (Prog. Chem. Fibrinol.
Thrombol. 3:191-209, 1978). This amino acid sequence has been
confirmed by cDNA sequencing (Malinowski et al., Biochemistry
23:4243-4250, 1984 and Forsgren et al., FEBS Lett. 213:254-260,
1987), which indicated the presence of an additional Ile
residue at position 65. Accordingly, plasminogen contains 791
amino acids (see Figure 1). The A chain of the molecule
consists of the activation peptide (77 amino acid residues)
and five disulfide bond-folded structures called "kringles"
(about 90 residues each). The B chain contains the activation
site (between Arg-561 and Val-562), the active site His-603
residue region, the active site Asp-646 residue region, the
region which is linked to the heavy chain by a disulfide bond,
the active site Ser-741 residue region, and the C-terminus
(amino acid numbers used herein refer to the sequence shown in
Figure 1). The first kringle structure (K1) in the A chain of
plasminogen is responsible for its binding to fibrin (Thorsen
et al., Biochim. Biophys. Acta 668:377-387, 1981). The B chain
of plasminogen carries all three active sites essential for
catalytic function as a serine protease.

There are several genes in the human genome that are
homologous to that of plasminogen, such as apolipoprotein(a)
(McLean et al., Nature 330:132-137, 1987). Apolipoprotein(a)
contains 37 copies of plasminogen kringle 4 and one copy of
plasminogen kringle 5. It also contains a serine protease
domain that is highly homologous with the B chain of
plasminogen.

Several cases of a molecular abnormality of plasminogen in
association with a complication of thrombosis have been
reported (Aoki et al., J. Clin. Invest. 61:1186-1195, 1978;
Kazama et al., Thromb. Res. 21:517-522, 1981; Wohl et al.,
Thromb. Haemostas. 48:146-152, 1982; Soria et al., Thromb. Res.
32:229-238, 1982 and Scharrar et al., Thromb. Haemostas.
55:396-401, 1986). These abnormalities have been found most
frequently in Japan, but have also been reported in other
populations. By an analysis of the plasminogen molecules from
these patients, it has been shown that an amino acid
substitution of Thr for Ala-601 in the B chain results in the
generation of an inactive plasmin molecule (Sakata and Aoki,
J. Biol. Chem. 255:5442-5447, 1980; Miyata et al., Proc. Natl.
Acad. Sci. USA 79:6132-6136, 1982; Miyata et al., J. Biochem.
96:227-287, 1984). However, the nature of the underlying
abnormality at the DNA level has not heretofore been
determined, and other plasminogen disorders have not been
characterized.

Since plasminogen is the key enzyme in the fibrinolytic
system, responsible for removing fibrin clots from
circulation, individuals with abnormal plasminogen or a
plasminogen deficiency develop thrombosis. Given the gene
frequency of approximately 0.02 among Japanese, the expected
number of homozygotes with the Thr-601 plasminogen variant is
calculated to be about 50,000 in Japan (population of
approximately 125 million). A few homozygotes have been found;
however, the homozygous condition is expected to be lethal in
most cases. In heterozygotes, the reduced plasminogen activity
in plasma seems to be insufficient to prevent thrombosis,
which may develop after trauma and is manifested as deep vein
thrombosis, thrombophlebitis or pulmonary embolism.
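
The estimate of about 50,000 homozygotes is what Hardy-Weinberg proportions give for the stated frequency and population. The following is a minimal sketch of that arithmetic, assuming random mating (an assumption not stated in the text); the function name is illustrative only.

```python
def expected_genotype_counts(allele_freq: float, population: int) -> dict:
    """Expected genotype counts under Hardy-Weinberg proportions.

    allele_freq is the frequency q of the variant allele; the normal
    allele frequency is p = 1 - q. Random mating is assumed.
    """
    q = allele_freq
    p = 1.0 - q
    return {
        "homozygous_normal": p * p * population,
        "heterozygous": 2 * p * q * population,
        "homozygous_variant": q * q * population,
    }


# Values taken from the text: q of about 0.02, population of about 125 million.
counts = expected_genotype_counts(0.02, 125_000_000)
for genotype, n in counts.items():
    print(f"{genotype}: {n:,.0f}")
```

Under the same assumption, the heterozygote count (2pq times the population) is roughly one hundred times the homozygote count, which is why a test that also identifies carriers is of clinical interest.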

Conventional biological assays for plasminogen activity and
antigen concentration do not accurately identify the molecular
basis of thrombosis, because plasminogen can be decreased in
several acquired disease states, such as liver dysfunction and
disseminated intravascular coagulation, or by thrombolytic
therapy using plasminogen activators. Because proper therapy
is dictated by the nature of the underlying condition, it is
important to make a definitive diagnosis in the case of a
genetic molecular abnormality. An additional complication in
diagnosing plasminogen-related disorders arises from the high
degree of homology between plasminogen and apolipoprotein(a).
This homology makes it difficult to distinguish between DNA
sequences encoding the two proteins.



WO 89/12697 



PCT/US89/02731 




Previously described methods of identifying the presence of
the Thr-601 plasminogen mutation are not well suited to
clinical use. Miyata et al. (Proc. Natl. Acad. Sci. USA
79:6132-6136, 1982) used proteolytic digestion of plasminogen
and amino acid sequence analysis of the resultant peptides to
characterize the mutation. Aoki et al. (Biochemical Genetics
22:871-881, 1984) utilized electrofocusing, zymography and
immunofixation of neuraminidase-treated plasminogen. The
entire procedure required four or more days to perform.

There is therefore a need in the art for improved methods of
detecting the presence of mutations in the plasminogen gene.
Such methods should be technically simple and rapid enough to
permit clinical use. The present invention provides such
methods for genetic diagnosis at the DNA level and has the
additional advantage of not being influenced by the presence
of other disease conditions.

Disclosure of the Invention

Briefly stated, the present invention is directed toward
methods for detecting the presence of a mutation in the
plasminogen gene of a patient. In one aspect of the present
invention, the method comprises (a) amplifying a portion of
genomic DNA from a patient, the portion including a
predetermined exon comprising the site of a selected mutation
and at least 14 base pairs of each of two intron sequences
flanking the exon; (b) exposing the amplified DNA to a
restriction endonuclease capable of differentially cleaving
DNA having the selected mutation and wild-type plasminogen
DNA, under conditions suitable for activity of the
endonuclease; and (c) analyzing the exposed DNA to detect the
presence or absence of cleavage fragments diagnostic for the
selected mutation. Within preferred embodiments, the selected
mutation is the Phe-355 mutation or the Thr-601 mutation. The
method may also include, prior to the step of amplifying,
isolating genomic DNA from the patient.

Within a related aspect of the present invention, a method of
detecting the presence of a mutation in the plasminogen gene
of a patient is disclosed, wherein the method generally
comprises (a) denaturing genomic DNA from the patient;
(b) annealing the denatured genomic DNA to a pair of
oligonucleotide primers, wherein the first primer is
complementary to a first sequence of at least about fifteen
consecutive nucleotides of a first intron on the coding strand
of the genomic DNA, and wherein the second primer is
complementary to a second sequence of at least about fifteen
consecutive nucleotides of a second intron on the noncoding
strand of the genomic DNA, the introns flanking the exon
comprising the site of a selected mutation; (c) extending the
annealed primers to produce double-stranded DNA fragments, the
fragments including the site of the selected mutation;
(d) denaturing the double-stranded DNA fragments;
(e) annealing the denatured DNA fragments to the pair of
oligonucleotide primers and extending the annealed primers to
produce selectively amplified DNA; (f) exposing the
selectively amplified DNA to a restriction endonuclease
capable of differentially cleaving DNA having the selected
mutation and wild-type plasminogen DNA, under conditions
suitable for activity of the endonuclease; and (g) analyzing
the exposed DNA to detect the presence or absence of cleavage
fragments diagnostic for the selected mutation, wherein the
selected mutation is the Phe-355 mutation or the Thr-601
mutation. Within a preferred embodiment, the primers are
extended using Taq DNA polymerase.

Within another aspect of the present invention, a diagnostic
kit for the rapid detection of the Thr-601 mutation in the
plasminogen gene of a patient is disclosed. The kit includes,
within suitable compartments: a pair of oligonucleotide
primers, the first primer being complementary to a first
sequence of at least about fifteen consecutive nucleotides of
an intron on the coding strand of genomic DNA from a patient,
the second primer being complementary to a second sequence of
at least about fifteen consecutive nucleotides of a second
intron on the noncoding strand of the genomic DNA, the introns
flanking the exon coding for amino acid residue 601 of
plasminogen; Taq DNA polymerase; control DNA; a restriction
endonuclease capable of differentially cleaving Ala-601
plasminogen DNA and Thr-601 plasminogen DNA; and suitable
buffers.

Within yet another aspect of the present invention, a
diagnostic kit for the rapid detection of the Phe-355 mutation
in the plasminogen gene of a patient is provided. The kit
comprises, contained within suitable compartments, (a) a pair
of oligonucleotide primers, the first primer being
complementary to a first sequence of at least about fifteen
consecutive nucleotides of an intron on the coding strand of
genomic DNA from a patient, the second primer being
complementary to a second sequence of at least about fifteen
consecutive nucleotides of a second intron on the noncoding
strand of the genomic DNA, the introns flanking the exon
coding for amino acid 355 of plasminogen; (b) Taq DNA
polymerase; (c) control DNA; (d) a restriction endonuclease
capable of differentially cleaving Val-355 plasminogen DNA and
Phe-355 plasminogen DNA; and (e) suitable buffers.

These and other aspects of the present invention 
will become evident upon reference to the following 
detailed description and attached drawings. 

Brief Description of the Drawings

Figure 1 illustrates the cDNA sequence and amino acid sequence
of plasminogen. The positions of certain restriction enzyme
recognition sites are shown. Numbers in the left margin refer
to nucleotide positions. Numbers above the sequence refer to
amino acid positions.

Figure 2 illustrates portions of the sequence of the normal
human plasminogen gene. N indicates an undetermined
nucleotide. Arrows indicate exon-intron boundaries. Exon
sequences are underlined and labeled with Roman numerals. The
5' end of exon I and the 3' end of exon XIX were not
determined; the 3' end of exon XIX is shown at the proposed
polyadenylation signal. The partial gene sequence is presented
in 10 sections, labeled a through j, showing: a, exon I and
adjacent intron sequences; b, exons II and III and adjacent
intron sequences; c, exon IV and adjacent intron sequences;
d, exon V and adjacent intron sequences; e, exon VI and
adjacent intron sequences; f, 10,000 base pairs comprising
exons VII, VIII, IX and X; g, 10,000 base pairs comprising
exons XI, XII and XIII; h, 10,000 base pairs comprising exons
XIV, XV, XVI and XVII; i, intron sequence (4473 bp); and
j, exons XVIII and XIX with adjacent intron sequences.
Nucleotides in each of sections a through j are independently
numbered as designated in the right margin, beginning with 1.

Figure 3 illustrates a portion of the genomic DNA sequence
encoding plasminogen and the sequences of two sets of
oligonucleotide primers (designated A39, 1A, 10A and 11A) used
to selectively amplify a portion of the genomic DNA. The
locations of certain restriction enzyme recognition sites are
indicated.

Figure 4 shows the results of an Fnu 4HI digest of selectively
amplified genomic DNAs from three unrelated patients with
abnormal plasminogen and a normal individual. The molecular
weight marker is a 123-bp ladder obtained from Bethesda
Research Laboratories. AbI, II and III refer to samples from
abnormal patients I, II and III, respectively.

Best Mode for Carrying Out the Invention

Prior to setting forth the invention, it may be useful to
define certain terms used herein.

Selectively amplifying: The process of increasing the copy
number of a preselected DNA sequence or fragment relative to
the copy number of other sequences or fragments in a sample.

Differentially cleaving: Cleaving a first sequence or set of
sequences but not cleaving a second sequence or set of
sequences. Restriction endonucleases differentially cleave DNA
sequences due to their ability to specifically recognize short
stretches of paired bases, frequently palindromic sequences of
four to six base pairs. Cleavage may occur within the
recognition sequence or at some specific distance away from
the recognition sequence.

Site of the selected mutation: The position in a gene at which
a mutation is known to occur, regardless of whether that
particular allele carries the mutant or wild-type sequence at
the site.

As noted above, reduced plasminogen activity can lead to
thrombotic episodes. Also as noted above, such a reduction in
activity can result from a variety of causes, including
genetic abnormalities. Practical methods of clinical screening
for genetic abnormalities in plasminogen have heretofore been
unavailable.

The present invention provides methods useful in diagnosing
cases of thrombosis, in genetic screening and in prenatal
diagnosis. The methods are simple, rapid, and do not require
the use of radioactive isotopes, so they are particularly
useful in the many clinical laboratories that lack the special
facilities necessary for handling radioisotopes.

The present invention is related, in part, to the elucidation
of the human plasminogen gene sequence, portions of which are
shown in Figures 2 and 3. Knowledge of this sequence has
permitted the design of oligonucleotide primers that may be
used to selectively amplify those portions of the gene
encoding amino acid residue 601 or amino acid residue 355. In
a similar manner, other abnormal plasminogen gene sequences
may be analyzed, allowing those skilled in the art to
selectively amplify exons comprising sites of other selected
mutations.

The methods of the present invention are applied to genomic
DNA samples from a patient. In one embodiment, the genomic DNA
is first isolated, using conventional procedures. A convenient
source of isolated genomic DNA is leukocytes, which may be
readily obtained from a small (e.g., 10 ml) blood sample.
Other cell types may also be used. DNA may be isolated from
leukocytes using the technique of Bell et al. (Proc. Natl.
Acad. Sci. USA 78:5759-5763, 1981). Briefly, blood is
collected in the presence of an anticoagulant, the cells are
lysed, and the nuclei are collected. The nuclei are then
treated with sodium dodecyl sulfate and proteinase K and the
DNA is extracted from the mixture with
phenol/chloroform/isoamyl alcohol. The DNA is then
precipitated and resuspended in a suitable buffer, such as
10 mM Tris-HCl (pH 7.5), 1 mM EDTA. Alternatively, by using
the method disclosed by Kogan et al. (New Eng. J. Med.
317:985-990, 1987), the methods of the present invention may
be applied directly to tissue samples, without the need to
isolate the DNA. For example, chorionic villus samples can be
screened directly by disrupting the tissue by vortexing in a
solution of 0.1 M NaOH, 2 M NaCl, 0.5% SDS. The sample is then
boiled for two minutes, centrifuged, and an aliquot is taken
for amplification. This facilitates the application of these
methods to prenatal diagnosis of the plasminogen abnormality.

Genomic DNA (either isolated or in the form of a suitable
tissue sample) is then selectively amplified to provide a high
copy number of the desired portion of the plasminogen gene
(e.g., the portion encoding amino acid residue 601 or the
portion encoding amino acid residue 355). Preferably, a
sequence of approximately 200-1,000, most preferably about
300-400, base pairs is selectively amplified. In a preferred
embodiment, the exon encoding amino acid 601 and portions of
the intron sequences flanking this exon are selectively
amplified. Similarly, the
exon encoding amino acid 355 and portions of the flanking
introns may be selectively amplified. A preferred method of
amplification is the polymerase chain reaction, described by
Mullis (U.S. Patent Nos. 4,683,202 and 4,683,195). Briefly,
the genomic DNA is denatured to separate the coding and
noncoding strands. Denaturation is preferably accomplished by
heat treatment of the DNA, generally treatment at about
80°C-105°C for about one to ten minutes, although enzymatic
denaturation may also be used. Most preferably, the DNA is
heated at about 93°C for one minute. The denatured DNA is then
combined with a molar excess of a pair of oligonucleotide
primers under conditions that allow the DNA strands to anneal
to the primers (e.g., 60°C for one to three minutes,
preferably about two minutes). Preferably, each primer is used
at a concentration of about 1 µM for amplification of one
microgram of genomic DNA. Suitable results may be obtained
with 5 µg of primer per µg of target DNA. One of the primers
is complementary to a sequence on the coding strand and the
second primer is complementary to a sequence on the noncoding
strand, the sequences flanking the region to be amplified.
"Sequences flanking the region to be amplified" include exon
sequences, sequences of introns immediately adjacent to the
exon to be amplified and sequences of other introns, so long
as the amplified region includes the site of the selected
mutation. The flanking sequences should be selected so as to
provide an amplified portion of the gene within the size
limits noted above. Although 100% complementarity is not
required, a high degree of complementarity of primer and
genomic DNA is advantageous in that it results in high
specificity and efficiency of amplification. For use within
the present invention, the primers must be sufficiently
complementary to hybridize with their respective strands on
the genomic DNA. The annealed primers are enzymatically
extended using a DNA polymerase and all four
deoxyribonucleotide triphosphates (dNTPs). Suitable
polymerases include E. coli DNA polymerase I, the Klenow
fragment of E. coli DNA polymerase I, Taq DNA polymerase,
and T4 DNA polymerase. Taq DNA polymerase (Saiki et al.,
Science 239:487-491, 1988) is particularly preferred. The
reaction mixture is incubated under conditions of time and
temperature suitable for the activity of the polymerase. When
using the Taq DNA polymerase, the mixture is incubated at
about 70°C ± 10°C for approximately three minutes. As will be
appreciated by one skilled in the art, the exact time and
temperature will be determined by the melting point of the
annealed DNA. The resulting extension products are separated
from the original DNA strands, preferably by heat
denaturation. The annealing, extension and separation steps
are then repeated, preferably about 25 to 30 times, until the
desired degree of amplification is obtained. At that time, the
final separation step is omitted, and double-stranded DNA is
isolated. In general, it is preferred to add the primers and
dNTPs at the beginning of the amplification reaction in
sufficient quantity to allow full amplification to occur
without the need to add additional reagents during the course
of the reaction series. The use of Taq DNA polymerase
facilitates such a process, as this heat-stable enzyme is not
inactivated by the heat denaturation steps and the reaction
need not be interrupted for the addition of more polymerase.
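
The reason 25 to 30 cycles suffice can be seen from the idealized growth of the target: each complete cycle can at most double the number of copies. The sketch below illustrates that upper bound; the per-cycle efficiency parameter is an assumption introduced here for illustration and is not a value given in the text.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of amplification cycles.

    efficiency is the fraction of templates copied in each cycle
    (1.0 means perfect doubling). Real reactions plateau as primers
    and dNTPs are consumed, so this is an upper bound, not a prediction.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


for n in (25, 30):
    print(f"{n} cycles: up to {pcr_copies(1, n):,.0f} copies per starting template")
```

With perfect doubling, 25 cycles give about 3.4 x 10^7 copies per starting template and 30 cycles about 1.1 x 10^9, which is consistent with the amplified fragment being directly visible on an ethidium bromide-stained gel.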

As noted above, oligonucleotide primers for use in the
polymerase chain reaction are constructed to be complementary
to sequences flanking an exon comprising the site of a
selected mutation, such as the exon containing the codon for
amino acid 601 or the exon containing the codon for amino acid
355. A first primer is designed to be complementary to a
sequence on the coding strand, and a second primer is
complementary to a sequence on the noncoding strand of the
DNA. Preferably, the primers will be complementary to intron
sequences because intron sequences will exhibit the least
amount of intergene homology. The primers are preferably at
least about 15-20 bases in length, more preferably at least
about 25 bases in length. Primers shorter than about 20 bases
will often have reduced specificity, and may anneal to and
amplify unwanted sequences. Primers are preferably less than
50 bases in length, more preferably less than about 30 bases
in length. Longer primers may self-anneal or their use may
lead to reduced specificity.
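
The length preferences above can be expressed as a simple check. The following sketch assumes cutoffs read directly from the paragraph (at least 15 and fewer than 50 bases as acceptable, 25 to 29 bases as more preferred); the function names and the example sequence are hypothetical and are not the primers of Figure 3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_length_class(primer: str) -> str:
    """Classify a primer by length using the preferences stated in the text."""
    n = len(primer)
    if n < 15:
        return "too short"
    if n >= 50:
        return "too long"
    if 25 <= n < 30:
        return "more preferred"
    return "acceptable"


# Hypothetical 27-base intron sequence on one strand.
target = "ATTGCCATGGTACCTTAGGCATCGATA"
primer = reverse_complement(target)  # a primer complementary to that strand
print(primer, primer_length_class(primer))
```

A primer designed against one strand is simply the reverse complement of the chosen flanking sequence on that strand, so the same helper serves for both members of a primer pair.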
Within the present invention, alternative methods of DNA
amplification may also be used. For example, a genomic library
may be prepared by digesting genomic DNA from a patient and
cloning the resultant DNA fragments into a suitable vector
(e.g., plasmid, cosmid or bacteriophage). The library is then
amplified by conventional methods, and plasminogen-encoding
clones are screened for the presence of the mutation.

The amplified DNA is then incubated with a restriction
endonuclease which is capable of differentially cleaving
normal and abnormal plasminogen DNA. Suitable restriction
endonucleases for identification of the Thr-601 mutation
include Fnu 4HI and Bbv I. Endonucleases suitable for
identification of the Phe-355 mutation include Ava II,
Bam NxI, Cau I (Bingham and Darbyshire, Gene 18:87-91, 1982;
Molemans et al., Gene 18:93-96, 1982), Hgi BI, Hgi CII, Hgi EI
and Sau 96I. However, the invention is not limited to the use
of particular enzymes, but is intended to include the use of
other suitable enzymes which may from time to time become
available. Restriction endonucleases are commercially
available from, for example, New England Biolabs (Beverly,
Mass.), Bethesda Research Laboratories (Gaithersburg, Md.) and
other suppliers. The amplified DNA is incubated with the
endonuclease under conditions of time, temperature and buffer
composition suitable for the activity of the endonuclease.
Such conditions are generally specified by the supplier.

Following exposure to the restriction endonuclease, the DNA
sample is analyzed to detect the presence or absence of
cleavage fragments diagnostic for the selected mutation, for
example by electrophoretic separation of DNA fragments. In a
preferred embodiment, the DNA is electrophoresed on an agarose
gel containing ethidium bromide. Endonuclease Fnu 4HI cleaves
the normal plasminogen sequence at the codon for Ala-601. The
presence of the Thr-601 mutation prevents this cleavage,
resulting in no change in fragment size following exposure to
the enzyme. Priming in the introns flanking the codon for
amino acid 601, as disclosed in more detail below, resulted in
amplification of a ~340 bp fragment. The normal sequence could
be cleaved by Fnu 4HI to yield fragments of about 240 bp and
100 bp. Also, as discussed in more detail below, the mutation
of Val-355 to Phe can be detected by amplifying a ~390 bp
fragment, digesting the amplified DNA with Ava II and
analyzing the digested DNA. The Phe-355 mutation results in
the presence of a 360 bp fragment, which is not present in the
Ava II digest of wild-type DNA.
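
The principle of differential cleavage can be illustrated in a few lines: a single base change inside a recognition sequence removes the site, so the fragment stays uncut. The sketch below is a simplified model that cuts only the top strand at a fixed offset within each GCNGC site (the offset of 2 reflects the published GC^NGC cut position of Fnu 4HI and is stated here as an assumption); the 20-base sequences are hypothetical and are not taken from the plasminogen gene.

```python
import re

# IUPAC codes needed for the recognition sequences used in this sketch.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting seq at every match of site.

    cut_offset is the number of bases from the start of the site to the
    cut position on the top strand. A lookahead is used so that
    overlapping sites are all found.
    """
    pattern = "(?=(" + "".join(IUPAC[base] for base in site) + "))"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical sequences for illustration only.
normal = "ATTACAGCTGCATTACATTA"  # contains GCTGC, which matches GCNGC
mutant = "ATTACAACTGCATTACATTA"  # a G-to-A change destroys the site

print(digest(normal, "GCNGC", 2))  # two fragments
print(digest(mutant, "GCNGC", 2))  # one uncut fragment
```

The same function models a mutation that creates a site rather than destroying one, since only the presence or absence of a match changes the fragment list.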

The methods described herein are well suited to clinical use.
In particular, the combination of the polymerase chain
reaction and restriction analysis can be used to diagnose the
specific plasminogen abnormality at the DNA level in a rapid
and straightforward manner. Partial purification of genomic
DNA from leukocytes takes several hours, and amplification by
the polymerase chain reaction takes about three hours.
Restriction digestion of the amplified DNA and its analysis on
agarose gels require about one hour or less each. Therefore,
the entire diagnostic procedure can be performed in a single
day.
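
The figure of about three hours for amplification agrees with the preferred step times given earlier (one minute of denaturation, two minutes of annealing and three minutes of extension per cycle). A minimal sketch of that arithmetic follows; it assumes instantaneous temperature changes, which a real heating block does not achieve, and the function name is illustrative.

```python
def cycling_minutes(cycles: int, denature: float = 1, anneal: float = 2,
                    extend: float = 3, final_extension: float = 0) -> float:
    """Total incubation time in minutes, ignoring temperature ramp times."""
    return cycles * (denature + anneal + extend) + final_extension


print(cycling_minutes(25))                     # 150 minutes
print(cycling_minutes(30))                     # 180 minutes
print(cycling_minutes(30, final_extension=7))  # with the 7 minute final step of Example 1
```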

As briefly described above, suitable kits for diagnosing these
plasminogen mutations contain oligonucleotide primers, Taq DNA
polymerase, an appropriate restriction enzyme, buffers, and
normal (control) DNA in appropriate packaging.

The following examples are offered by way of 
illustration, and not by way of limitation. 

EXPERIMENTAL 


Taq DNA polymerase was obtained from New England Biolabs and
The Perkin Elmer Corporation (Norwalk, Conn.). Restriction
endonucleases and T4 DNA ligase were purchased from Bethesda
Research Laboratories (Gaithersburg, Md.) or New England
Biolabs. The Klenow fragment of Escherichia coli DNA
polymerase, bacterial alkaline phosphatase, ATP,
deoxynucleotides, dideoxynucleotides, M13mp18, and M13mp19
were supplied by Bethesda Research Laboratories.
dATP[α-35S] was provided by Amersham (Chicago, Ill.).

Oligonucleotides were synthesized using a nucleotide
synthesizer (Applied Biosystems, Foster City, Calif.) and
kindly supplied by Drs. Patrick S.Ef. Chou, Yim Foon Lee and
Jeff Harris.

Example 1 

Leukocyte genomic DNA samples were obtained from three
unrelated Japanese patients with abnormal plasminogen (named
abnormal I, II and III, respectively), a daughter of abnormal
III (abnormal III-2) and three unrelated normal American white
individuals. Abnormals I, II and III-2 had a history of
thrombosis, but abnormal III did not. The plasma of abnormal I
had a trace of plasminogen activity in spite of a normal
plasminogen antigen concentration, and the plasma from the
mother and a sister of abnormal I showed a 50% reduction in
enzymatic activity of plasminogen. Accordingly, abnormal I is
a homozygote of a nonfunctional plasminogen variant. Abnormal
II is a heterozygote of a plasminogen variant, since the
plasminogen in the plasma of the patient and his two daughters
has about half of the specific activity (activity per antigen)
of normal plasminogen. Abnormal III is a homozygote of the
plasminogen variant named PLG B (Nishimukai et al., Hum.
Hered. 36:137-142, 1986) as determined by isoelectric
focusing. Abnormal III-2 is a heterozygote of PLG B with a
normal plasminogen concentration and half of normal specific
activity.

Genomic DNA samples were prepared from the leukocytes of the
patients with abnormal plasminogen and from normal individuals
by the method of Bell et al. (ibid.). Typically, 10-40 ml of
blood is collected in citrate buffer. Ten ml of blood is added
to 90 ml of 0.32 M sucrose, 10 mM Tris pH 7.5, 5 mM MgCl2,
1% Triton X-100, and the mixture is incubated at 4°C to lyse
the cells. Nuclei are collected by centrifugation at 1,000 x g
for 10 minutes and resuspended in 4.5 ml of 0.075 M NaCl,
0.024 M EDTA, pH 8.0. The nuclei are treated with SDS and
proteinase K, and the DNA is extracted with
chloroform/phenol/isoamyl alcohol, precipitated with ethanol
and resuspended in the appropriate buffer.

Nucleotide primers A39 and 1A (Figure 3) for the putative
introns N and O flanking the exon coding for amino acid
residue 601 of plasminogen (exon XV) were synthesized for the
polymerase chain reaction. These regions were selected because
they lie outside the putative exon XV, and upon selective
amplification they produce a fragment of a length suitable for
analysis by restriction digestion and DNA sequencing. Both the
5'- and 3'-ends were modified to generate convenient
restriction sites (Hind III) for cloning directly into the M13
sequencing vector. One µg of genomic DNA was amplified in a
100 µl reaction mixture containing 50 mM KCl, 10 mM Tris
(pH 8.4), 2.5 mM MgCl2, each primer (A39 and 1A, Figure 3) at
1 µM, each dNTP at 200 µM, gelatin at 200 µg/ml, and 2.5 units
of Taq DNA polymerase (Saiki et al., Science 239:487-491,
1988). The sample was placed in a small Eppendorf tube and
overlaid with 100 µl of mineral oil to prevent evaporation.
The sample was heated at 93°C for one minute to denature the
DNA, cooled to 60°C for two minutes to anneal the primers, and
incubated at 70°C for three minutes to extend the annealed
primers. The procedure was repeated for a total of 25-30
cycles of amplification. At the end of the last cycle, the
sample was incubated at 70°C for 7 minutes to ensure the
completion of the final extension step. After precipitation
with ethanol and resuspension in 100 µl TE buffer (10 mM
Tris-HCl, pH 7.5, 1 mM EDTA), 5 µl was applied to a 1.5%
agarose gel for submerged electrophoresis, and stained with
ethidium bromide. A discrete band of about 340 bp was obtained
for each sample, as predicted from the sequence of the gene
for normal plasminogen.
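
When setting up a reaction of this kind, it can help to convert the stated final concentrations into absolute amounts for the 100 µl volume. The sketch below shows the unit arithmetic only (a concentration in µM multiplied by a volume in µl gives picomoles); the helper names are illustrative, and stock concentrations are not given in the text, so pipetting volumes are deliberately not computed.

```python
def amount_pmol(conc_uM: float, volume_ul: float) -> float:
    """Amount in picomoles: (µmol per litre) x (µl) = pmol."""
    return conc_uM * volume_ul


def mass_ug(conc_ug_per_ml: float, volume_ul: float) -> float:
    """Mass in micrograms for a concentration given in µg per ml."""
    return conc_ug_per_ml * volume_ul / 1000.0


REACTION_UL = 100  # reaction volume used in Example 1

print("each primer:", amount_pmol(1, REACTION_UL), "pmol")
print("each dNTP:", amount_pmol(200, REACTION_UL) / 1000, "nmol")
print("gelatin:", mass_ug(200, REACTION_UL), "µg")
```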

The samples from abnormals I, II, III, III-2 and
normal individuals were digested with three units of Fnu
4HI endonuclease for one hour or with six units of enzyme
for four hours at 37°C. Five microliters of each sample
was then applied to a 1.5% agarose gel containing ethidium
bromide. The 340 bp fragment of normal DNA was cleaved
into two fragments (about 240 and 100 bp), while that of
the DNA from abnormal III remained unchanged (Figure 4).
The Fnu 4HI digests of the 340 bp fragments from abnormals
II and III-2 each showed a mixed pattern of normal DNA and
the DNA from abnormal III. In contrast, the DNA from
abnormal I was cleaved completely. Prolonged digestion of
the samples for four hours with six units of enzyme gave
exactly the same results (Figure 4). The amplification and
digestion of the genomic DNAs from abnormals I, II, III and
III-2 were performed eight, two, three and two times,
respectively, and the results obtained were the same in each
experiment for each sample. Fnu 4HI recognizes only the
GCNGC sequence, suggesting that one or more of these four
nucleotides in the DNAs of abnormals III, III-2, and II is
replaced by other nucleotides. Alternatively, a short
stretch of nucleotides could be deleted or inserted in the
abnormal DNA.
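
The three digest patterns reported above (cleaved, unchanged and
mixed) can be told apart by a simple rule on the observed band sizes.
The sketch below is illustrative only: the function name
classify_fnu4hi_pattern is hypothetical, and it assumes bands have
already been rounded to the nominal sizes of 340, 240 and 100 bp,
whereas sizes read from a real gel are approximate.

```python
def classify_fnu4hi_pattern(band_sizes_bp):
    """Label an Fnu 4HI digest of the ~340 bp fragment by its bands.

    band_sizes_bp: iterable of nominal band sizes in base pairs.
    """
    bands = set(band_sizes_bp)
    uncut = 340 in bands            # intact amplified fragment present
    cut = {240, 100} <= bands       # both cleavage products present
    if cut and uncut:
        return "mixed"              # pattern seen for abnormals II and III-2
    if cut:
        return "cleaved"            # pattern seen for normal DNA and abnormal I
    if uncut:
        return "uncleaved"          # pattern seen for abnormal III
    return "uninterpretable"


if __name__ == "__main__":
    for lane in ([240, 100], [340], [340, 240, 100]):
        print(lane, "->", classify_fnu4hi_pattern(lane))
```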

To characterize the mutation(s) at the DNA level,
the amplified fragments were sequenced. Since both the 5'-
and 3'-end primers were designed to produce double-stranded
fragments flanked by Hind III recognition sequences, the
amplified 340 bp fragments from normal and abnormal
individuals were digested with Hind III and ligated into M13
sequencing vectors cut with Hind III. In order to obtain
the DNA sequence coding for the specific region around
amino acid residue 601, the amplified DNAs were also
digested with Hinc II and Pst I endonucleases. The
digested samples were electrophoresed on a 1.5% agarose
gel, electroeluted, and dialyzed against 0.1X TBE (1X TBE
is 89 mM Tris-borate, 89 mM boric acid, 20 mM EDTA)
overnight. The dialyzed samples were extracted with phenol and
chloroform, precipitated with ethanol, resuspended in TE,
and finally subcloned into M13mp18 or mp19 in order to
obtain discrete overlapping sequences. The DNA sequences
of the inserts were then obtained using the
dideoxynucleotide method (Sanger et al., Proc. Natl. Acad. Sci. USA
74:5463-5467, 1977) with dATP [α-35S] and buffer gradient
gels (Biggin et al., Proc. Natl. Acad. Sci. USA
80:3963-3965, 1983).

The DNA sequences obtained from the three normal
individuals included 343 bp. These sequences were the same
as expected for the normal gene except for the presence of
Hind III sites at both the 5'- and 3'-ends. The sequence of
the Hinc II-Pst I fragments from the normal DNAs included
205 bp, and was also the same as the established sequence
of the normal gene for plasminogen. The actual sequence of
the region coding for amino acid 601 (Ala) included
ACTGCTGC in the normal gene.

On the other hand, the DNA sequence analysis of
both Hind III and Hinc II-Pst I fragments of abnormal III
revealed that the gene of abnormal III contained the
sequence ACTACTGC. This corresponds to a single base
change resulting in the substitution of Thr (ACT) for Ala
(GCT). Twenty-three templates from the amplified samples
of abnormal III were sequenced and all of them showed the
same abnormal sequence (G to A change). No other
alterations of nucleotides were found by DNA sequence analysis.
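
The loss of the Fnu 4HI site can be checked directly on the two
8-nucleotide stretches quoted above. The following is a minimal
sketch, not part of the described method: it scans one strand for the
GCNGC recognition sequence named in the text, and the helper names
(fnu4hi_sites, differences) and the two-entry codon table are
introduced here purely for illustration.

```python
import re

# GCNGC, with N = any nucleotide; the lookahead reports overlapping sites.
FNU4HI = re.compile(r"(?=GC[ACGT]GC)")

NORMAL = "ACTGCTGC"        # stretch quoted for the normal gene
ABNORMAL_III = "ACTACTGC"  # stretch quoted for abnormal III

# Only the two codons discussed in the text are included.
CODONS = {"GCT": "Ala", "ACT": "Thr"}


def fnu4hi_sites(seq):
    """Return 0-based start positions of GCNGC in seq."""
    return [m.start() for m in FNU4HI.finditer(seq.upper())]


def differences(a, b):
    """Return 0-based positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print("normal sites:      ", fnu4hi_sites(NORMAL))
    print("abnormal III sites:", fnu4hi_sites(ABNORMAL_III))
    for pos in differences(NORMAL, ABNORMAL_III):
        print(f"position {pos}: {NORMAL[pos]} -> {ABNORMAL_III[pos]}")
    # The second codon of the quoted stretch encodes residue 601.
    print(CODONS[NORMAL[3:6]], "->", CODONS[ABNORMAL_III[3:6]])
```

The single G to A change falls on the first G of GCTGC, so the
abnormal stretch no longer contains a GCNGC match.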

When twelve templates for abnormal II were
sequenced, one-half of them showed the same sequence as the
normal gene except for a point mutation (T to C) 5
nucleotides prior to the Fnu 4HI site, and the other half had
the same abnormal sequence as abnormal III. These results
confirmed that abnormal III is a homozygote of a plasminogen
variant and that abnormal II is a heterozygote of the
same variant allele.

The exon XV DNA sequence of abnormal I was the
same as that of the normal gene, indicating that the
abnormality in this molecule is in another region.

A second set of primers (designated 10a and 11A
in Figure 3), flanked by Eco RI recognition sequences and
four additional nucleotides, was used to confirm the
results. A band of 360 bp was obtained for each sample as
predicted.
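
The tailed-primer layout described above (four additional nucleotides
followed by an Eco RI recognition sequence) can be verified
mechanically. The sequences of primers 10a and 11A are given only in
Figure 3, so the sketch below instead uses the three primers that
Example 2 quotes in full and whose products are digested with Eco RI.
The function name describe_tail and the returned field names are
hypothetical; GAATTC is the standard Eco RI recognition sequence.

```python
ECORI_SITE = "GAATTC"

# Primer sequences as quoted in Example 2, written 5' to 3' without spaces.
PRIMERS = {
    "K4a-5'": "GTCAGAATTCTCAGAGGCTAGCGTACT",
    "K4a-3'": "CTACGAATTCTGGCTCTAACACAAATTTCC",
    "K4a-32": "AAATGAATTCCTAGGAAGTTGGCTTGAAGC",
}


def describe_tail(primer):
    """Split a primer into 5' extra bases, Eco RI site and remainder."""
    pos = primer.find(ECORI_SITE)
    if pos < 0:
        return {"length": len(primer), "extra_5prime": None, "remainder": primer}
    return {
        "length": len(primer),
        "extra_5prime": primer[:pos],                  # bases before the site
        "remainder": primer[pos + len(ECORI_SITE):],   # bases after the site
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        info = describe_tail(seq)
        print(name, info["length"], "nt; 5' extra:", info["extra_5prime"],
              "; remainder:", info["remainder"])
```

Each of the three quoted primers carries exactly four nucleotides
ahead of the GAATTC site and is 27 or 30 nucleotides long overall.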

Example 2

Plasminogen gene exon X DNA of abnormal I was
amplified essentially as described above using primers
K4a-5' (5' GTC AGA ATT CTC AGA GGC TAG CGT ACT 3'; coding
strand primer) and K4a-3' (5' CTA CGA ATT CTG GCT CTA ACA
CAA ATT TCC 3'; noncoding strand primer). The amplified
DNA was digested with Eco RI, and the resulting ~390 bp
fragment was cloned into an M13 phage vector and sequenced.
Sequence analysis revealed the presence of the sequence
GTGTTCCAG in six of the templates, as compared to the
wild-type sequence GTGGTCCAG. This T for G substitution results
in the substitution of a phenylalanine residue for the
normal valine residue at amino acid position 355, located
several residues upstream of Kringle 4 in the A chain
(Figure 1).
DNA samples from normal and abnormal individuals
were digested with five units of Ava II endonuclease for
one hour at 37°C. The 390 bp DNA fragment from the normal
individuals was cleaved into three fragments of
approximately 230 bp, 130 bp and 30 bp. DNA samples from abnormal
I and two daughters (abnormals I-2 and I-3) and a nephew
(abnormal I-4) of abnormal I showed a mixture of 360 bp,
230 bp, 130 bp and 30 bp fragments. These results
indicated that the abnormal patients were heterozygous for the
Phe-355 mutation. Thus, this mutation can be diagnosed by
the presence of a 360 bp Ava II fragment when DNA is
selectively amplified using primers K4a-5' and K4a-3'.

In a second series of experiments, DNA from
abnormals I, I-2, I-3 and I-4 was amplified using primers
K4a-5' and K4a-32 (5' AAA TGA ATT CCT AGG AAG TTG GCT TGA
AGC 3'; noncoding strand primer). Digestion of the
resulting ~370 bp fragment with Ava II confirmed the loss
of an Ava II site in the abnormal DNA, and also confirmed
the diagnosis of abnormals I, I-2, I-3 and I-4 as
heterozygotes of the Phe-355 mutation.
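
The fragment sizes reported for the 390 bp Ava II digest follow from
simple arithmetic on cut positions. The sketch below is illustrative
only: the cut coordinates (230 and 360) are hypothetical placements
chosen to reproduce the approximate sizes given in the text, and the
order of the fragments along the amplified DNA is an assumption.

```python
def fragment_sizes(length_bp, cut_positions):
    """Return fragment lengths after cutting a linear DNA at given positions."""
    edges = [0, *sorted(cut_positions), length_bp]
    return [right - left for left, right in zip(edges, edges[1:])]


AMPLICON_BP = 390

# Hypothetical coordinates: two Ava II sites in the wild-type allele,
# the first of which is lost in the Phe-355 allele.
WILD_TYPE_CUTS = [230, 360]
PHE_355_CUTS = [360]

wild_type = fragment_sizes(AMPLICON_BP, WILD_TYPE_CUTS)
phe_355 = fragment_sizes(AMPLICON_BP, PHE_355_CUTS)

# A heterozygote carries both alleles, so its lane shows the union of bands.
heterozygote = sorted(set(wild_type) | set(phe_355), reverse=True)

if __name__ == "__main__":
    print("wild-type allele:", wild_type)
    print("Phe-355 allele:  ", phe_355)
    print("heterozygote:    ", heterozygote)
```

Removing the site between the 230 bp and 130 bp fragments merges them
into the diagnostic 360 bp band, and the union of both alleles gives
the four-band mixture described for the heterozygous patients.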

From the foregoing, it will be appreciated that,
although specific embodiments of the invention have been
described herein for purposes of illustration, various
modifications may be made without deviating from the spirit
and scope of the invention. Accordingly, the invention is
not limited except as by the appended claims.

Claims

1. A method of detecting the presence of a mutation
in the plasminogen gene of a patient, comprising:

amplifying a portion of genomic DNA from the patient, 
said portion including a predetermined exon comprising the site 
of a selected mutation and at least 14 base pairs of each of 
two intron sequences flanking said predetermined exon; 

exposing said amplified DNA to a restriction 
endonuclease capable of differentially cleaving DNA having the 
selected mutation and wild-type plasminogen DNA under 
conditions suitable for activity of the endonuclease; and 

analyzing the exposed DNA to detect the presence or 
absence of cleavage fragments diagnostic for the selected 
mutation. 

2. The method of claim 1 wherein the selected 
mutation is the Phe-355 mutation or the Thr-601 mutation. 

3. A method of detecting the presence of a mutation
in the plasminogen gene of a patient, comprising:

a. denaturing genomic DNA from the patient; 

b. annealing the denatured genomic DNA to a pair of 
oligonucleotide primers, wherein the first primer 
is complementary to a first sequence of at least 
about fifteen consecutive nucleotides of a first 
intron on the coding strand of the genomic DNA, 
and wherein the second primer is complementary to 
a second sequence of at least about fifteen 
consecutive nucleotides of a second intron on the 
noncoding strand of the genomic DNA, wherein said 
introns flank the exon comprising the site of a 
selected mutation; 

c. extending the annealed primers to produce
double-stranded DNA fragments, said fragments including
the site of the selected mutation;

d. denaturing the double-stranded DNA fragments;

e. annealing the denatured DNA fragments to the pair
of oligonucleotide primers and extending the
annealed primers to produce selectively amplified
DNA;

f. exposing said selectively amplified DNA to a 
restriction endonuclease capable of differen- 
tially cleaving DNA having the selected mutation 
and wild-type plasminogen DNA, under conditions 
suitable for activity of the endonuclease; and 

g. analyzing the exposed DNA to detect the presence 
or absence of cleavage fragments diagnostic for 
the selected mutation, wherein the selected 
mutation is the Phe-355 mutation or the Thr-601 
mutation. 



4. The method of claim 3 wherein the primers are 
extended using Taq DNA polymerase. 

5. The method of claim 3 wherein each of said first 
and second primers is from about twenty to about thirty 
nucleotides in length, inclusive. 

6. The method of claim 3 wherein said selected
mutation is the Thr-601 mutation and said first primer includes
the sequence CAA TTT AAC TAA AAT TTG AAC TAA AT or TGT ACA ATG
GAG CAG AAC AAA.

7. The method of claim 3 wherein said selected 
mutation is the Thr-601 mutation and said second primer 
includes the sequence TCA TGT CTA CTA AAA CAC CCG GAC TTA or 
TCT CCT TTC TGT GTC ATG TCT A. 

8. The method of claim 3 wherein said selected
mutation is the Phe-355 mutation and said first primer includes
the sequence GTC AGA ATT CTC AGA GGC TAG CGT ACT.

9. The method of claim 3 wherein said selected
mutation is the Phe-355 mutation and said second primer
includes the sequence CTA CGA ATT CTG GCT CTA ACA CAA ATT TCC
or AAA TGA ATT CCT AGG AAG TTG GCT TGA AGC.

10. The method of claim 3, further comprising the
step of isolating genomic DNA from the patient prior to the
step of denaturing the genomic DNA.

11. The method of claim 3 wherein the endonuclease
differentiates between G and A in the first position of the
codon for amino acid 601 of plasminogen.

12. The method of claim 11 wherein the restriction
endonuclease is selected from the group consisting of Fnu 4HI
and Bbv I.

13. The method of claim 3 wherein said endonuclease 
differentiates between G and T in the first position of the 
codon for amino acid 355 of plasminogen. 

14. The method of claim 13 wherein the restriction
endonuclease is selected from the group consisting of Ava II
and Sau 96I.

15. The method of claim 3 wherein the steps of 
denaturing comprise heat treatment of the DNA. 

16. The method of claim 3 wherein approximately
300-400 bp of genomic DNA is amplified.

17. The method of claim 3 wherein steps d and e are
repeated in sequence from about twenty-three to about
twenty-eight times prior to step f.

18. A diagnostic kit for the rapid detection of the
Thr-601 mutation in the plasminogen gene of a patient,
comprising in suitable compartments within the kit:

a pair of oligonucleotide primers, the first primer
being complementary to a first sequence of at least about
fifteen consecutive nucleotides of an intron on the coding
strand of genomic DNA from a patient, the second primer being
complementary to a second sequence of at least about fifteen
consecutive nucleotides of a second intron on the noncoding
strand of the genomic DNA, the introns flanking the exon coding
for amino acid residue 601 of plasminogen;

Taq DNA polymerase; 

control DNA; 

a restriction endonuclease capable of differentially 
cleaving Ala-601 plasminogen DNA and Thr-601 plasminogen DNA; 
and 

suitable buffers. 

19. A diagnostic kit for the rapid detection of the 
Phe-355 mutation in the plasminogen gene of a patient, 
comprising in suitable compartments within the kit: 

a pair of oligonucleotide primers, the first primer 
being complementary to a first sequence of at least about 
fifteen consecutive nucleotides of an intron on the coding 
strand of genomic DNA from a patient, the second primer being 
complementary to a second sequence of at least about fifteen 
consecutive nucleotides of a second intron on the noncoding 
strand of the genomic DNA, the introns flanking the exon coding 
for amino acid 355 of plasminogen; 

Taq DNA polymerase; 

control DNA; 

a restriction endonuclease capable of differentially 
cleaving Val-355 plasminogen DNA and Phe-355 plasminogen DNA; 
and 

suitable buffers. 
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Ball 
t 

1 GCPCTGCTGGCCAGTCCCAflAATSGAfiCATftAGGfiftGTGGTTCTTCTflCTTCTTTTftTTTCTGftflqTCn 

METG I uH i sUy sG 1 u V a I V a 1 Le uUeuLe uLeuUe uPh eL e uLy sS e r 

♦I 

755 GGTCAAGGAGflGCCTCTGGATGACTATGTGAATflCCCAGGSGGCTTCACTGTTCAGTGTCflCTAAGflAG 
GlyGtnGlyGluProLeuAspAspTyrValAsnThrGInGXyAlaSerLeuPheSerValThrLysLys 
PvuII EcoRI PstI 

21 I II 
139 CAGCTGGGAGCAGGAAGTATAGAAGAATGTGCAGCAAAATGTGAGGAGGACGAAGAATTCACCTGCAGG 
GlnUeuGlyAlaGlySerlleGluGluCysAlaAlaLysCysGluGXuAspGluGluPheThrCysArg 

44 

ECS GCATTCCAATATCACAGTAAAGAGCAACAATGTGTGATAATGGCTGAAAACAGGAAGTCCTCCATAATC 
Al aPheGl nTyrHi_s£erLysGl uGl nGl nCysVal 1 1 eMETAlaG 1 uflsnAr gLysSerSer I lei le 

67 

£77 ATTAGGATGAGAGATGTAGTTTTATTTGAAAAGAAAGTGTATCTCTCAGAGT6CAAGACTGGGAATGGA 
IleArgMETArgAspVaiValLeuPheGluLysLysValTyrLeuSerGluCysLysThrGlyAsnGly 

90 

346 PAGAACTACAGAGGGACGATGTCCAAAACAAAAAATGGCATCACCTGTCAAAAATGGAGTTCCACTTCT 
LysAsnTyrArgGlyThrMETTSerLysThrLysAsnGlylleThrCysGlnLysTrpSerSer-ThrSer 

PstI 

113 1 
415 CCCCACAGACCTAGATTCTCACCTGCTACACACCCCTCAGAGGGACTGGAGGAGAACTACTGCAGAAAT 
ProHisArgProArgPheSerProAlaThrHisProSerGluGlyLeuGluGiuAsnTyrCysArgAsn 

Apal 

136 i 
484 CCAGACAACGATCCGCAGGGGCCCTGGT6CTATACTACT6ATCCAGPAAA6AGATATGACTACTGCGAC 
ProAspAsnAspProGlnGlyProTrpCysTyrThrThrAspProGIuLysArgTyrAspTyrCysAsp 

Nsil 

159 I 
553 ATTCTTGAGTGTGAAGA6GAATGTAT6CATTGCAGTGGAGAAAAGTATGACGGCAAAATTTCCAAGACC 
IleLeuGluCysGluGiuGluCysMETHisCysSerGlyGluAsnTyrAspGlyLysIleSerLysThr 

StuI 

182 I 
622 ATGTCTGGACT6GAATGCCAGGCCTGGGACTCTCAGAGCCCACACGCTCATGGATACATTCCTTCCAAA 
METSerG 1 yLeuG 1 uCysG InAl aTrpAspSerG 1 nSerProH i sA 1 aH i s»G 1 y Tyr 1 1 eProSerLys 
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205 " " 

6S1 TTTCCflRftCAftGPflCCTGftABfifl(5ftRTTflCTGTC6TftflCCCCGAGAGGGfiSCT6CG6CCTTGGT6JTTC- 

PheProflsnLysAsnUeuLysLysAsnTyrCysftrgAsnProGluftrgSXaLeuftrgProTrpCysPhe 
Eco47III 

228 I 

760 PCCflCCGflCCCCARCftfiGCGCTGGGPflCTTTGTGACflTCCCCCGCTGCfiCAftCRCCTCCACCftTCTTCT 

ThrThrRspProRsnLysftrgTrpGluLeuCysflspUeProRrgCysThT^ThrProProProSerSer 

25 I 

829 GGTCCCRCCTACCRGTGTCTGRRGGGRRCRGGTGARRRCTRTCGCGGGPRTGTGGCTGTTACCGTGTCC 
GlyProThrTyrGlnCysLeuLysGiyThrGlyGluAsnTyrArgGLyAsnValAlaValThrValSer 

ApaLI 

274 I 
898 6GGCACACCTGTCRGCACTGGAGT6CACAGACCCCTCACACACATAACAG6ACACCAGAAAACTTCCCC 
GXyHisThrCysGlnHisTrpSerAlaGlnThrProHisThrHisAsnArgThrProGluRsnPhePro 

ApalNcol 

297 I I 

9G7 TGCAAAAATTTGGATGAAAACTACTGCCGCAATCCTGACGGRAAAAGGGCCCCATGGTGCCATACAACC 
CysLysAsnLcuflspGlMAsnTyrCysArgAsnProAspGlyLysArgAlaProTrpCysHisThrThr 

Seal 

320 I 
1036 AACAGCCAAGTGCGGTGGGAGTACTGTAAGATACCGTCCT6TGACTCCTCCCCAGTATCCACG6AACAR 
AsnSerGlnValArgTrpGluTyrCysLysIleProSerCysAspSerSerProValSerThrGluGIn 

Ncol 

343 I 
IieS TTGGCTCCCACAGCACCACCTGAGCTAACCCCT6TG6TCCAGGACTGCTACCATGGTGATGGACAGAGC 
Le uA 1 aProThr AX aProProGX uLeuThrProVaX Val G X nAspCy sTyrHi sG lyAspG 1 yG 1 nSer* 

366 

H 7 4 TACCGAGGCACATCCTCCACCACCACCACAGG AAAG AAGTGTCAGTCTTGGTCATCT ATGACACCACAC . 
TyrAr gG 1 yThrSerSerThrThrThrthrGl yLysLy sCysG 1 nSerTr pSerSerMETThrProH i s 

Pst I 

389 I 
1243 CGGCACCAGAAGACCCCAGAARACTACCCAAATGCTGGCCTGACAATGAACTACTGCAGGAATCCAGAT 
AvgH i sG 1 nLysThrProGX uAsnTy rProAsnAl aG 1 yLeuThrMETftsnTynCy sflr gflsnProfls p 

Seal 

412 I 
1312 GCCGATAAAGGCCCCT6GTGTTTTACCACAGACCCCAGCGTCAGGTGGGAGTACTGCAACCTG AAfiAAA 
AlaAspLysGlyProTrpCysPheThrThrAspProSerValArgTrpGluTyrCysAsnLeuLysLys 

435 

1381 TGCTCAGGRACA6AAGC6AGTGTTGTAGCACCTCCGCCTGTTGTCCTGCTTCCAGATGTAGAGACTCCT 
CysSerGlyThrGluAlaSerValValAlaProProProValValLeuLeuProAspValGluThrPro 

458 

1450 TCCGARGAAGACTGTATGTTTGGGAATGGGAAAGGATACCGAGGCAAGAGGGCGACeACTGTTACTGGG 
SerGluGluAspCysMETPheGlyAsnGlyLysGlyTyrArgGlyLysArgAXaThrThrValThrGly 

481 

1519 ACGCCATGCCAGGACTGGGCTGCCCAGGAGCCCCATAGACACAGCATTTTCACTCCAGAGACAAATCCA 
ThrProCysGlnAspTrpAlaAlaGlnGluProHisArgHisSerllePheThrProGluThrAsnPro 

504 

1588 CGGGCGGGTCTGGAAAAAAATTACTGCCGTAACCCTGATGGTGATGTAGGTG6TCCGTGGTGCTACACG 
ArgRlaGlyLeuGl uLy s AsnTy rCy sArg AsnPro As pG 1 yAspVa 1 S 1 yG 1 y ProTr pCy sT y rTh r 

527 

1G57 ACARRTCCAAGAAAACTTTACGACTACTGTGATGTCCCTCAGTGTGCGGCCCCTTCATTTGATTGTGGG 
ThrAsnProArgLysLeuTyrAspTyrCysAspValProGlnCysAXaAXaProSerPheAspCysGly 
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1736 ftSyCCTCftflGTGGflGCCGflflGRflftTGTCCTGGAAGGGTTGTAGGGGGGTGTGTGGCCCflCCCnCATTCC 
LysProG 1 nVa 1G1 uProLy sLy sCy sProGl yArgValValG 1 yG 1 yCy sVa 1 ft 1 aH i sProH i sSer 

EcoRV 

573 I 
1795 TGGCCCTGGCAfiGTCftGTCTTflGftftCftflGGTTTGGflflTGCflCTTCTGTGGflGGCftCCTTGRTftTCCCCA 
Tr pProTr pG 1 nVa 1 SerLe u Ar gThr Ar gPh eG 1 y METH i sPh eCy sG 1 yG 1 y ThrLeu 1 1 eSerPr o 

StuI 

596 601 I 

1864 GAGTGGGTGTTGRCTGCTGCCCflCTGCTTGGftGAAGTCCCCAAGGCCTTCATCCTRCAftGGTCftTCCTG 
GluTrpV/alLeuThrAlaAlaHisCysLeuGluLysSerProArgProSerSerTyrLysVallleLeu 

ApaLI 
619 

1933 GGTGCACACCAAGAAGTGAATCTCGAACCSCATG7TCAGGAAATAGAADTSTCTAGGCTGTTCTTGGAG 
G 1 yA 1 aK i sG 1 nG 1 uVa 1 AsnLeuG 1 uProH isValG lnGluI leGl uVa 1 Ser*Ar g Le uPh eLe uG 1 u 

642 

2002 CCCACACGAAAAGATATTGCCTTGCTAAAGCTAAGCAGTCCTGCCGTCATCAC.TGACAAAGTAATCCCA 
ProThrArgLysAspIleAlaLeuLeuLysLeuSerSerProAlaVal I leThrAspLysVal i lePro' 

665 

2071 GCTTGTCTGCCATCCCCAAATTATGTGGTCGCTGACCGGACCGAATGTTTCATCACTGGCTGGG6AGAA 
A I aCysLeuProSerProAsnTyrVa 1 Val A 1 aAspAr gThrG 1 uCysPhe.I leThrG 1 yTr pG lyG 1 u 

688 

3 1 AO ACCCA AGGTACTTTTGGAGCTGGCCTTCTCAAGGA AGCCCAGCTCCCTGTGATTGAGAATA A AGTGT6C 
Th rG X nG 1 yThrPh eG 1 yftl aGl yLeuLe uLy sG 1 uA 1 aG 1 nLeuProVa 1 1 1 eG 1 u AsnLy sVa 1 Cy s 

7 I i 

2209 AATCGCTAT6AGTTTCTGAATGGAAGAGTCCAATCCACCGAACTCTGTGCTGGGCATTTGGCCGGAGGC 
Asn Ar gTyrG 1 uPheLeuAsnG 1 y Arg Va 1 G 1 nSerThr G 1 uLe uCys AlsGlyHi sLeuA laGlyGly 

734 

2278 ACTGACA6TTGCCAGGGTGACA6TGGAGGTCCTCTGGTTTGCTTCGAGAA6GACAAATACATTTTACAA 
ThrAspSerCysGinGlyAspSerGlyGlyProLeuVaXCysPheGluLysAspLysTyrlleLeuGln 

ApaLI 

757 I 
2347* GGAGTCACTTCTTGGGGTCTTGGCTGTGCACGCCCCAATAAGCCT6GTGTCTATGTTCGTGTTTCAA6G 
GlyValThrSerTrpGlyLeuGlyCysAlaArgProAsnLysProGiyValTyrValArgValSerArg 

780 79 1 

2416 TTTGTTACTTGGATTGA66GAGT6ATGAGAAATAATTAATTGGACGGGAGACAGAGTGACGCACTGACT 
PheValThrTrpIleGluGlyValMETArgAsnAsn 

SphI 
I 

24BS CACCTA6AGGCTGGAAC6AGGGTA6GGATTTAGCATGCTGGAAATAACTGGCA6TAATCAAACGAAGAC 



2554 ACTGTCCCCA6CTACCAGCTACGCCAAACCTCGGCATTTTTTGTGTTATTTTCTGACTGCTGGATTCTG 



2623 TAGTAAGGTGACATAGCTATGACATTTGTTAAAAATAPACTCTGTACTTAACTTTGA 
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GAATTCCGCA 


GACATTCCAC 


CCAAGACCAT 


TGGGCTCCCA 


CCTCTACTCT 


TTTGCCAGTT 


60 


AATGAATAGG 


CAGGAATTTC 


ACTGCCTGGA 


AAGAGGAACA 


ATGCTTTCTG 


GTCCTTATTT 


120 


CACATCTAAA 


ATAGAGAGGT 


CAATTGATTT 


ATTCCTAAAT 


ATCTTTGAAC 


ACTAAAATAG 


180 


AAGTTTTACA 


GCATATATAC 


TACCTGGTTG 


CTCTAGACTT 


AAGCCAGGGA 


AAAGTACAGA 


240 


TTCAACATTT 


AAAATTGAGA 


TAGACGCTTT 


CCACTTAATG 


CTACCAGTCT 


TGCTTTATTT 


300 


CATGAGAATG 


AGAATATAAT 


AATATGGCAT 


ACGTTCATTT 


GGGGGAAAGA 


TTGATGTCTT 


360 


ATAACATAAT 


TTATAATTAC 


AGAAAACATG 


TGAGTTCACT 


GGGAATAAAT 


AAATTTTGAA 


420 


GATAATAAGA 


TACTTTCACT 


TATGTCATAA TTTCTATGTC 


ATTTGGTGTA 


GGATGTAGAG 


480 


ATATTAACGT 


TTACACCTAA 


CTCAAGTTTG 


TCATCTAAGA 


CCTGAAAGGG 


TTTTGTCTAT 


540 


CAGCTGCACC 


CCTGGGTAGA 


GACACAACCT 


TGGGGAAGGC 


CTCAGCCCCA 


TCCCTCGTAC 


600 


AGCAGGAATG 


AGAACAGCCC 


TGCCTGTTGG 


GAAGCTTGAG 


GGAGGCTATG 


GACGTGCAGC 


660 


GCTTGGCAGA 


AGGTCTCGTC 


ATGGAAGGTT 


CCAGCAAATG 


TGAGATACTT 


TTATGATTTC 


720 


ATTTTCTCCA 


AAAGAAAGGG 


AATAAGAGAA 


GAGGGGAGGA 


AATAAGACTA 


ATTGCGAGAG 
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ATAAAGTACA 


AGGGTGAGGG 


AAGGAATAAG 


GAGACATGAC 


GGCAGCGTGG 


AGCAGCCGAG 


840 


GGGGGAGATT 


GCTTTCACCA 


CTTCCCAGCA 


TCTATTGCAG 


ATTCCACCCT 


CAAACATTTT 


900 


GTAAGGACTC 


TTTATTCAAG GTAACGTTTG 


AACCCTGCTG 


AGCCAGTGGC 


ATGGGTCYCT 


960 


GAGAGAATCA 


TTAACTTAAT 


TTGACTATCT 


GGTTTGTGGA 
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CTCATGTAAG 
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TCAACAACAT 


CCTGGGATTG 


GGACCCACTT 
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CCCAAAATGG 
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AACATAAGGA AGTGGTTCTT 
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TTTTCTCCCA 


CAATGTAGTA 


AAAATACATA 


TGCCATGGCT 
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TTATGTGCAA 


TTCATTTAAT 
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TGAAACTTCC 


AGTTGAAAAT 


CTTGTATAAG 
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ATTGAGGAAT 


TC 
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CCCCAGTGTC 


TTTAGTTGCC 


ATCTTTATTT 


ATGTCCAAAT 


GCCCGACTGT .GTGTTCTTAA 
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CTAAACATTT 


TGATTCATAG 


CTACCCATTC 


TACTTCCAGT 


AAACAGAAAG 


TTTTATTTGG 
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TTAATGCTAA 


CCAAATAGAT 


TAAAAGGAAG 


TCATGACAAT 


TAGACATTGA 


CATTGATTTA 
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CTGACCATTT 


ATTCCACTTG 


GATCTCCCAC 


CTCTAGGTCA 
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CTGGATGACT 
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ATGTGAATAC 
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GTGTCACTAA 


GAAGCAGCTG 
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GTATAGAAGA 


ATGTGCAGCA 


AAATGTG AGG 1 *AGGACGAAGA 


ATTCACCTGC 


AC^GTATTTCC 
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ATTGTCGTTG 


CACCTACGCA 


GGAATCTGTA 


ATTCAGATGG 


CAAGTAATTT 


ACTCACAAAT 


420 


TTATTAATGA 


TTTAAGAGGA 


AAGAGAAATT 


TATGGAGCCA 


GAGTTTGGAA 


CTATATTTGC 


480 


TCACAGTATG 


TGAAGCCATA 


CTAACAGCTT 


CTTGTTAAGG 


TTTATTGGAG 


TCTTTGTTAG 


540 


AAAAATACCC 


TCAAAGGAAG 


TTATTTGTTT 


TTACACCGGA 


CACAAACATT 


AGCAGTTATT 


600 


GTTCTGAGCT 


CCAGTTTTCA 


ACATCATCAT 


CAGTAAATGT 


TTGTTGAGGA 


TCAGGTGAAT 


660 


GAAAGTGTCC 


TAGATAGATC 


TGAGCAATGA 


CTTATAGCTA 


CAAGATCCAG 


TGCCTGCCCT 


720 


TTAGTATTTA 


AGGTGTAGTC 


AAAGAAACTG 


GATATAATGT 


TAAAAAAAAA 


AAAAAGACAG 


780 


CCCAAGTGAG 


GTACAGGCAT 


AATCAATGCA 


TGCTCTACCC 


AGATCCAGAA 


GAAAGAACAG 


840 


TGCCTAAGGT 


TGAGGCAGCT 


AGAGAAGGCT 


CAGGGAGGAG 


GTGGGAACTG 


AGCTGGGTTT 


900 


GGAGTTGAGA 


GAGCTCTTGA 


CAAGCACCAG 


GAAGGCAGGG 


GAAGATGCGG 


CCCTGCACCT 


960 


TCTGAGGGGG 


ACCATTAAGA 


GATGAAGTTG 


ACTAAAGCAG 


AGACTTTGTG 


TAGGTGACGG 


1020 


GCTTGGGAAG 


GTAGCTATGG 


AATCCAGACT 


GAGCACCCAT 


AGCAGGACCA 


CGGGATGGAG 


1080 






GGGTGGGGTG 


GAATGTGGAG 


CAGAGGTTCA 


GGGGAACTGA 


i i a n 


TCAGAGTTGG 


GAGGTCATGG 


AGACGGACTA 


TCTTGGCGAA 


TGGGTTCAAA 


GCAACCAGAG 
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TTGCTTCTTT 


CCAACCCAAA 


AACAAAAATT 


AAGAAGATGA 


GTGAAGAAGA 


AGTAAAGCAG 


1260 


TTGAAACAGG 


AAGAAAGGGA 


AAATTATGAG 


GGAGGGAAGG 


TAAGGGCAGA 


TAAGATTTGC 


1320 


TGCCACGTTG 


GTGTATTTTG 


TTCAGTACTT 


CATCGATGCC 


ATGCCCAAAT 


AACTGAAAGA 


1380 


GGCAGCAATT 


CTGAACTCTC 


TGGTCCCTCA 


AGATATTCAA 


TGATCTTTAG 


CATGTCTCAC 


1440 


TTATTAATAA 


ACATTTGTTT 


TCTTTAAATA 


AAGAAAAATA 


CTTATTGGAT 


TTCCTGCTTC 


1500 


GTTCTGCAGG 


GCATTCCAAT 


ATCACAGTAA 


AGAGCAACAA 


TGTGTGATAA 


TGGCTGAAAA 


1560 


CAGGAAGTCC 


TCCATAATCA 


TTAGGATGAG 


lit 

AGATGTAGTT 


TTATTTGAAA 


AGAAAGGTGA 


1620 
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GTACATTTTC TTCCTCCTCC TCCTACTGTC CTCCCCATCC TCCCACTCTT CCTCTTTCTC 1680 
TATTCTATCT TTAATTTATA AGACCAGAGG AGGAAGGCAC TATCGTGTTA TAAAACTGAA 1740 
TTC 1743 
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CCAAGACCTC TGGCTGCACT GTGCCCCGTG GTGTCCCCAG CATCCTGGTG GGGCTCGATA 60 

CACAGAGAGC TCATAAGTAG CATTTGAATA CATGAATCAA AGAATGGCTC AGTTTACTGC 12 0 

AGCCTTTTTG CAGATGCAAA AGATGATCTT TTAGAAAGCA GAAACAGGGG GTCTGGTGCA 180 

TGAGATCTTT TTCTCAACGT GACTATGCTG TGCAGACCTT CATGTGGTGT CTTGTGAAAG 24 0 



ACTTTGACCA CTGTGTGGAC TTCCCTTCAG TGTATCTCTC AGAGTGCAAG ACTGGGAATG 300 

GAAAGAATTA CAGAGGGACG ATGTCCAAAA CAAAAAATGG CATCACCTGT CAAAAATGGA 3 60 

| IV 

GTTCCACTTC TCCCCACAGA CCTAG GTAAG ACATTCCCTT TCATCTTTGT GTTCATCTAC 4 20 

TGTAAAGTTG TCCCTCTGTG TCTGTGAGGG ATTGGTTCCA GGACCCCTGT GGCTACCAAA 4 80 

ATCCATGCTT CTCAAGTCCC TTATATAAAA TGGTGCAGTA TTTGCATATA ACCTACATAC 54 0 

CTTCTCTTGT ATAATCCCTA ATATAATGTA AATGCTATTT AATCGTTGTT ATACTGTATT 600 

GTTTTTATTT GTATTATGTT TTATTGTCAT ATTGTTATTT TCTGTCATCT TTTTCAAGTC 660 

TTTTCCATCC ACAGTTGGTT GAATTTGTGG ATCTGGAACC CCATGGATAC AGAGGGCCAA 720 



CTGTATTTAG 


GATAATTTCA 


TCACTTTTAA 


TTCAAACCAC 


AATATGTGAA 


TAAGCAGATA 


780 


GAAAGAATCA 


AAAAGATGTC 


GATGTTCAAC 


TATTTTTGGC 


ACCATAGTAG 


AACATGGTTG 


84 0 


CTTTCTATTT 


TTTCTTGGAT 


ATGGAGGTTT 


CTTGAAGACC 


TAGAACATAG 


AAGAATGCCT 


900 


AGTTTAAAAA 


AAATCAATGA 


AACTATGAGT 


TTTAGGCCAA 


ATCTGAGAAA 


AGATCAAAGA 


960 


TGACTATGTT 


TGGGACTGAA 


GTAAGCATAT 


CAAGTTAGAA 


CTCTCATCAC 


ATGTTCGACT 


1020 


CAAATTGTGG 


AGCAAAAGAG 


TAAATAAGAT 


ATAAAAATGA 


AAATGAAGAT 


ACGTGAAATT 


1080 


CAAATGTTGC 


AACTTGCCTA 


TTATTTATTT 


TAGTGCATTT 


TTTTGTACTT 


TTCCCAGTTT 


1140 


GGTGTTAGGT 


GGCATTAAGT 


TCTCAGTAAT 


GACGCTTATC 


AAATAGGAAC 


TTAGTGCTTG 


1200 


TTACTCACCT 


TTATCCATTC 


CCCCAACACT 


CAACAAATTG 


CCTTTGCTAT 


ATCCCTATGA 


1260 


GATGAGCAGA 


TCAAATATTC 


CCCGTGAGTT 


AATGAAAACT 


GATTCAACCA 


AATGGCAAAG 


1320 


TCAGAGACTA 


TCGGGGGCCA 


TGGAGACACT 


CTGGGCCATT 


TTTATGAGGT 


AGTCTAGGCT 


1380 


CATCTTTATG 


AGGGAACTGA 


GGTCTCGGGG 


GGTGGGGG 
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CCTCATAGCT ATTTACACTT AGGCAAGTTT TGTTTTGTTT TGTTTTACGT TGCCACTCAG 60 

TTTTCTCATC TGTAAAATAG GG ATAATAAC ACCTTCCTCA AATGGTTTTA TTAGGACTAA 120 

AAGAGAGAAT GTGTGGAAAG ATGTTAGTGG AATTCCTGGC AGATAGTTCA CATGGACAAA 180 

ATGGTATTAA CTACAAAAAT TTTTACAGAG AAAACGGTAA CTGACAAAAG CAGGTGTTTG 24 0 

GAATGAATTA AGACCATGGC AGCCTTTTGA GGCCTTTATA TTTCTCCTGA CTGTGCAATA 3 00 

AAAATATTTT GGCTCTCTAA GACTTGGCTG TCACAGTAGC AATGGTAATA TTAGCTACTG 360 

TGCCAGAAGC AGCCTATCAA TAGAGAAATT GAAAATCTGA CCACACAAAT GCTGCAGCAC 420 

CCAGCTGAAA TGCATTTGGA TGACAATCTC AGATGGGAAT CGAGAGCATC TCCTTCTGCC 4 80 
TTGCTAATAG CAAGCTGATT TTTAGAATAT AGTCTAAGTG CTTCTTTTCC ATCCTCCCCA 540 
GATTCTCACC TGCTACACAC CCCTCAGAGG GACTGGAGGA GAACTACTGC AGGAATCCAG 600 
ACAACGATCC GCAGGGGCCC TGGTGCTATA CTACTGATCC AGAAAAGAGA TATGACTACT 660 

GCGACATTCT TGAGTGTGAA GG TCAGGAGT GGTTCTAGAA AATGTTTTCA TTTCTGCCCT 720 

TCACCTGTAA AATAATTTGT TGTAAAGCCC CTTCCCACAG GGATGTTATT AATAATTGAG 7 80 

TAACGTATTC ACCTCTCGGA AAGAAGCAAA ACCCCAGAAT TAACCTGAAT TTTTTTTTTT 84 0 

TTCTGAGACA GAGTTTTGCT CTCGTTGCCC AGGCTAGAGT GCAACCGTGC AATCTCGGCT 900 
CACCACAACC TCCGCCTCCG GGTTCAAGAG ATTCTGCTAC CTCAGCCTCC CAAGTAGCTG 960 

GGATTACAGG CATGTGCCAC CATGCCTGGC TAATTTTATA TTTTTAGTAG AGACAGGGTT 1020 

TCTCCACGTA GGTCAGGCTG GTCTTGAACT CTCGACCTCA GGTGATCCGC CTGCCTCAGC 1080 

CTCTCAAAGT GCTGGGATTA CAGGCATGAG CACCATGCCC AGCAGACCTG AATTATTTTT 1140 

ATTAAAATGT TACATCAACA TGTACAAATA TAAAACTACA TCTAAACTCT AAGTACAAAC 1200 

TTCTTATGCT TACAACTCTT ACACAGTG 1228 
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TTTTAAAAGA 


TCATTATTGA 


AATGAAGATG 


CCAAATATTG 


AAAACTCCTA 


ATGGAGAACG 


60 


TAGACTCCTG 


GGAATATATG 


CACCCTTGGC 


TCCCCACTGG 


CCTGTGCATC 


CCGGTCTAAG 


120 


GACATGGCAT 


CATGGAAATT 


CTGAACTTGG 


TCATGACTAC 


AATAGTTGAG 


GGAGTATTGA 


180 


CTAAAATATG 


TGAATGTTAC 


GGTTTAAAAG 


GAAAATGACA 


TTTGGATTAT 


GCTAGAAAAT 
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CCTGAGTCCT 


TATTGCCAAT 


TTTATTGCCA 


AGTGCCTGTT 


GTGAATTACA 


TCGGAATGAG 


300 


AGGCAAGTCG 


CACTTAAGTG 


AGTAGGATTC 


TGGTTTTTAC 


TCTCTATTTT 
TGTCCAGAGG 


GCTTCATCCA 
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TTTCAGTTTT 


CTTCTTCCTC 


TCTGTCCTTC 


CTTCCCACTC 


AATGTATGCA 
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TTGCAGTGGA 


GAAAACTATG 


ACGGCAAAAT 


TTCCAAGACC 


ATGTCTGGAC 


TGGAATGCCA 


480 


GGCCTGGGAC 


TCTCAGAGCC 


CACACGCTCA 


VI 

TGGATACATT 


CCTTCCAAGT 


AAGTCTCACT 


540 


GGGAAAAACA 


TTCCATGTTT 


AATTAAGGCT 


CTGCAGCTCT 


ATCAGACATT 


TGCTGTCATT 


600 


TAGATATTTT 


AGCATTCCTC 


AAGAAGTGAA 


CGCCTGATGT 


TTTTAATTTC 


AAAGCTAACC 


660 


TCCTCCCACA 


ATATTGCAAG 


TGAAATACGC 


ATTCTTGCTG 


CTCAAAATAT 


GGTCCACGGG 
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TCAGCAGCAG 


GGATGTTTTC 


TGAGAGTTTG 


TTAGAAATCC 


AGAA 
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CCAAAATGAT 


AAGGTCACTG 


ATTCTGTTGA 


FIG.2F 

GTGATTTTTA CACATGTAAA 


CTGTTAGAAA 


60 


AACAGTGCTT 


GGCAGCCGGG 


CATGGTGGCA 


CATGCTGTAG 


TCCTAGTTAC 


CTAAAGGGCT 


120 


GAAGCGGGAG 


GATTGCTTGA 


GTGAGTTCAA 


GGAGTTCAAG 


GCAAGCCTGG 


GCAATAAGTG 


180 


GGACCCTGTC 


TCTAAAAACA 


AACAAAAAAA 


AGAAAGTCCT 


TGGAATACAG 


GGCCAACCTT 


240 


GTTTCCTAGT 


TGCCATCTCT 


GAACACAGCC 


TTCATCTGAT 


TACCTCCTCC 


ATGCCCGACT 


300 


GTGCCTAGCA 


CACAGCAGGT 


GCTCAATGTT 


TGCTCTTGAA 


AAAGAGTCTT 
GATTTCCAAA 


ATCCATGAAT 


360 


GTAAATGTTC 


AGTGCTACTA 


AAATCTTTCT 


TGTCCATTCA 


CAAGAACCTG 


420 


AAGAAGAATT 


ACTGTCGTAA 


CCCCGATAGG 


GAGCTGCGGC 


CTTGGTGTTT 


CACCACCGAC 
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CCCAACAAGC 


GCTGGGAACT 


VII J, 
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GATTCCAGGA 


TTTGGACCTG 


CCCTGTTCTT 


GAAAJLUAAAA 


VaAAAACATGT 


GTCAGTGCCT 


600 


GAGTGCAGCC 


TCTGAAAAGT 


G AC CTA CAAG 


TCCTATGGGA 


TGTTATTGGT 


CTTTATTTTA 


660 


TTGCTGGTTT 


AAAACAGTTA 


TGGTTATTGG 


TTACTGTGGG 


TGATTGATCA 


GAGCGTCCAT 
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TTATCATGTT 


TTTCTTTCTT 


TGCAACTGAA 


ACTTCTGCCT 


CAGGAGTTCA 


CTGAAATGTA 


780 


GGCTTTAGGT 


GTTGTTCATC 


CTATTCTCTC 


TGTGCTAAAG 


GGAAATCAGA 


CCCATGCTCT 


840 


CTGACACATG 


GATTTCATTT 


TCAACCAGAG 


TTCTAATAGT 


TGTTTTGTAA 


ACAAAGAGTG 


900 


TCTTTCTTTA 


CAATGTTCAG 


GTCTGTGGGT 


GTCCAGTTTT 


TCCACCTTGG 


GGAGCAGAGG 


960 


GTGAGTGGTG 


GGGGTGGGGA 


AGAGTTCAAG 


AGGAGAAGAT 


GAAATGGCAG 


ACCTAGTAGA 


1020 


AATGATGTGG 


AGTAAACAAT 


TTTATCATAT 


TTTCCTCTCT 


GAGAATTTGA 


AGCAAAGGAT 


1080 


TACACACTAA 


GAGAAATACA 


GGCATGAAAG 


GTTAAAAAGG 


ATTCAGTGAG 


GGTTGGCCTC 


1140 


CCCTCCTTTC 


CTCTGACATG 


TGTCCTTTGA 


AAGCGGAAGT 


TCCTCAGGCA 


TTCTCCCTTT 


1200 


TTATGAATAT 


TAATTTCTCT 


TTTTTTTCAG 


TTTCTCTTTT 


TGTCATCTTT 


TTTCCTCAAG 


1260 


AATATCTTGA 


TTTCTGGATG 


CACACACTTT 


TCCTTGGAGG 


TGTTTTTTGC 


CTTCTTTCCA 


1320 


TGGACTCTTT 


CCCTGTTGTT 


TGGCTTTTAT 


GGCATGTTGG 


GTGCCATTCA 


GTCATGTCTA 


1380 


CTCAGTGAAT 


AATTTATTCT 


TCAGGAAAGA 


GAGTGGACCT 


TTGGTGTATG 


TGAGAATTCG 


1440 


GGGTGTGAGG 


TGACACGTGT 


TGATACTTAC 


CAGGTAGGAA 


GAACTGAGCA 


AAGAGAACAT 
ATTCCAAGAG AGGCTGAGCA GAAGCCAAGA CAGGCCAGAA CACCCTGCAG CCATCCTCCT 1680

TAACATCCAT CTGTGCATTC TCTATTTTAA AATTATTCAT TGTAGGGCTG GGCACGGTGG 1740

CTCACGCCTG TAATCCCAGC ACTTCCGGAG GCCGAGGTGG GTGGATCACG AGGTCAGGAG 1800 

TTCAAGACCA ACCTGGCCAA TATGATGAAA CCCCACCTCT ACTAAAAATA CAAAAAAATT 1860 

AGCCAGTTGT GGTGACACGC ACCTGTAGTC TGAGCTACTC GGGAGGCTGA GGCAGGAGAA 1920 

TGACTTGAAC CCAGGAGGCA GAGGTTGCAG TGAGCTGAGA TCGTGCCACT GACTCCAGCC 1980 

TGGGCGACAG AGCGAGACTC CGTCTCAAAA AATATATATA TTCATTGTAA CTTATTTTGC 2040

CCATTCAAG C AACACCTCCA CCATCTTCTG GTCCCACCTA CCAGTGTCTG AAGGGAACAG 2100 

GTGAAAACTA TCGCGGGAAT GTGGCTGTTA CCGTGTCCGG GCACACCTGT CAGCACTGGA 2160 

VIII | 
GTGCACAGAC CCCTCACACA CATAACAGGA CACCAGAAAA CTTTCCCTGC AA GTAAGTCC 2220 

CCTCCAGTCT CATTCTGCTG CTATGGAATG TGAAATCCCA TTGACTTTGC CTTAGTTTTA 2280 

GTTACTGTAG GAACGCAGGA TAAAGTATTC TGGAAGAAAA ACTGATCTAG TCATAAGTAA 2340

AGGAAATGAA CTTTAGCACG TTTTTTCCCG TAACGGTTGT TCTCAAAGCG TGGTTCCCTA 2400

GACTTTTTTC TTTTTGGAAA GCTAAACTCA CAATCACTTC TTTTTCA GAA ATTTGGATGA 2460

AAACTACTGC CGCAATCCTG ACGGAAAAAG GGCCCCATGG TGCCATACAA CCAACAGCCA 2520 

IX 

AGTGCGGTGG GAGTACTGTA AGATACCGTC CTGTGACTCC TCCCCAGTAT CCACGGAACA 2580
ATTGGCTCCC ACAG GTAAGC AAGGGTATGG GAGCTTACTG AGGGCCCAAG TTTTCTCCTT 2640 
ATTTTTGTAT ACCAGTGGCA TCATCACAAT ATACAGTAGC TTTGTAAGTT TAATGCTATT 2700 
GTGGTCAGAA AGCCTGCCCT TATGATTTCA GTTTTTTTAG ATTTGTTGAG GTTTGTTTTA 2760 
TGGTTCAGAA TATAGCCATC TTGGTGAATG TTTCATGTGC TCTTGAAAAG AATGTGTCTT 2820 
CTGCGGTTGT TGGGTGGGGT GTTCCCTCAA GGTCATTTAG GTGAAGTTGG TTGCTGGTGT 2880
TCTTCTGTAT CCTTACTGAT TGTCTGTCTC CTCCTTCATT GACTACTGTG GATGAATGGT 2940
GATGTGTCCA ACTTTAACTG TAAATTAGTC TATTTCTCTT TTAGATCGTA ACTCTTTTGT 3000 
ATATTTTGAA GCTCTTTTGT TAGGCACATA TGTATTTAGG ATGGTTATGT CTTCTAGATG 3060 
AAAGGACCCC TTTATCTTTA TGTAATGTTT CTTCTTATCT CTGGGAATAT TTCTTCTTCT 3120 
GAAGTTCTGA ACTCTCTTTA TGGTGATATA AATACAGTCT CACAGCTCTA TTTTCACTAG 3180 
TATTTGTGTG ATATATCTTT TAAATTTGTA TGATATATCT TTTAAATTTA TCTGAGCTTT 3240
TAAATTGAGA TGTTCAAACC ATTTGCATTC ATGCAATTGT TAATAGAGTT GAATTTACAT 3300
CTACCATCAA GTTAGTTATT TCTCTTTGTC CCATTTAAAC TTTGTTCCTT TTTTCATCTT 3360 
TTTCTGCCTT CATTTAGATT GAGTTTATCT CCACTACTCA CTTAGTAAAT TAATTTTTAA 3420
TGGTTTTAGT ATTTTCCACA ATGTTTATAA TATACATTTT TGACTTTTCA CATTCCACCT 3480
TCAAATGATA TCATTCTACT TGACATATGA ATCCTTACAT CATTGCAGTT CTACTTCCTC 3540
CCTCCCAAAA TGCTATACTA TTACTCTTTG TAATAGAAGC TTACTTCTAC TATGTCACAG 3600 
ATCTCACAAT ACATTGACAC TATTTTTGCC CTAATAGTTG TGTTTTAAAG TGATCAAGAA 3660 
TAAAACTATT TTAAATATTT TCTTTATTTA TTTATTTTAC CATTTCTGGT GCTTCTCATC 3720 
TACTGGGGTA GATCTCAATT TCCATCTGGT GTCAGTTTCT TTCTGTGAAA AACAACTTTT 3780 
AGCATTTTTT GTAGCACAGG TCTGCTACTG CTGAAGTCTT TCAGATTTTG AGTGTCTGAA 3840
AAAGTATTTT GCCTTCAGTT TTTAAAAGTA ATTTTGCTGA ACGTAGATAC TGGGTTGAGA 3900 
GTTTCATTAC TTGCAACACT TTAATGATGA TGTTCCATTA TCTTCTGTTT TAAATAGTTT 3960 
GACTAGTAAT CTGATCTTTG TTCCTATGTT TTCAATAGGT CATTTTTCTC TGACTACCTT 4020
TAAGATTTTC TCATCTTTGT TTTTCAACAG TTCGACTATG ATGTGTTTAT TATTAATTTC 4080 
TTTGTGTTTA ATCTGCTTGA GGTATTCTGA GTTCCTAGAT TTGTAGATTG TTGATTTTTT 4140
TCTTTTCTCT TTTTTCTTTT CTTTTCTTTT TTTTTTTTTT TTTTTTGAGA TGGAGCCTCA 4200 
CTCTGTCACC CAGGCTGGAG TGCAGTGGCG CAATCTCGGC TCACTGCAAA CTCCACCTCC 4260 
CAGGTTCAAG TGATTCTCCT GCTTCAGCCT CCTGAGGAGC TGGGACTACA AGCATGTGCC 4320 
ACCAGGCCCA GCTAATTTTT GTATTTTTGG TAGAGACAGA GTTTCGCCAT GTTGGCCAGA 4380
CTGGTCTCAA ACTTCTGACC TCAGACGGTC CATCACCTTG GCCTTCCAAA GTGCTGACAG 4440 
TACAGGTGTG AGCAACCGTG CCCAGCCTAG ATTGTTGATT TTCATTGTCC TTGTAAAATT 4500
CATAGCCATT ATCTGTTCAA ACGTTTCTTT TTGCACTTTT CTCTCTCTGT ATTTTCCTTT 4560 
TGGGACTCTA AGTACCACGT GTTTGGGATT CTAAGTACCC ACAACATTCA TGTTGTTTCA 4620
TAAATCTTGT AAGCTTGTTC TCTTTTTTTT TCAGTAACTC TTTTTCATTC TTTGTGTTGG 4680 
TTTGGATAAG TTCTGGTAAC CTATTTCCAA GTTTATGGAT TATTTTTTCA GTTGTTTCTA 4740 
GTCATCTCCT CAGCCCATTG AGAGAATTCT TCATCTCTGA TATTATGACT TTTTTTCTAG 4800 
CATTTTCATG TTACTCTTTT CTATAGTTTC CATCTTTGCT GAAATTCTCT ACCTATCTAT 4860
GCATACTGTC CACCGTTACA ACAAGATCCT TTAACATACT AATGTAGGTA TCACACAATC 4920 
CCAATCTGAT AGTTTCCAGA TGGCGTCTTC TCTAAGTCTG GCTCTCTGGA TTGCTTTATT 4980 
ATTCAACAGT GGCTTTTTGT TCCCCCTTGG GTTTTTTGGT GTGTCTTATA ATTCTTTAAT 5040
CAAACACTAG ACATTATAAA TAGAAGAACA GTAGAGGTTA CAGTAAATAT TATTTATACT 5100
TTGAAATGGA CACCCTTGTC TTGCAAATAT ATATCGTGGA TAATTGAGTC AATGTAGTCA 5160

CTAGTTTAAC TGAATTGGGA TTTGTGATTG CTAGTTTTAC CTTAAGTGCA CCACAGATAT 5220 

AAATTCCTCC AGTGATGTGC TGCTGCTATC TTTTACTTAG AGTGGGGCCT GGGGTGCTAA 5280 

AGAGTTTTCT CCGTGTTCCT ATCCATTCCC AGATTTCAGC AGTCACTGCA TGCCTGCACT 5340

ACAGAGGAGA TATCTTCATA CACATAATCT AACCCCATTG ACACTCGGCT GTTTCTTGTT 5400

ACTGAATGCT CACTTTTTGG TGGACGTAGG AGAATACTTA TCTCCCTGGT CTACCTCCCT 5460

CTTAGGCCAG TTGAGCACAG CTCGGCTTTG AAAGTAGTGA TTTTTCAGTG TTCTTGTGCC 5520 

TCCTTCTGAT GGAACTTGTA CCTGTGGTGG GTTTGGAAAG AAAGAGTAGT AGGCTTCTGC 5580 

TTCATTGCAA TGCAGGATGT TGGGCACAAG AGGATTCCCT GTAACTTCTC CAAGGGAATA 5640 

AGATTTTTGC CTCCACCACT CTCTGAGAAG CTGTGGATCT TTGCCTGCAG TCCTAGATGC 5700 

AGGACCATCA CCTGCCCTAT CACCCAGAAG CTTTGGTCTT TGGCTTTGTT TGAGGAAGGA 5760 

GCTAGAGAAA TGTGCAAAGC TTTCATGTCT GCCCCCCACT GACAGCCACT CACCACCCAC 5820 

AGCCTGCACT GCCGAATGCA TCCTCCTCTC ATCTGCCCTC GTGTTCTCAT GAACACTCAG 5880 

TAGGGACCCA TAAAAAAGAG CTTGCATGTA AGTGCAATTT CCAATTATAA GTACTCTATC 5940

TGTTCTTTCA CACCCAGGTT TTAAATGAAA TATTACTAGG AACTTATTAA TGTTCTAAAA 6000 

TGCTATAAAT CTATTTTTAT GTTAATCTGT CTGCTAATAC AGAAAAGAGA ACAGTCATAA 6060 

TTCTCAGAGG CTACCGTACT GTTTTTGTCA TAAATTGCTT CATGCTTCTT TTTTTTCAGT 6120 

AATTGTTAAG CTTGATTTCT TTTATTTTAA TTTCAG CACC ACCTGAGCTA ACCCCTGTGG 6180 

TCCAGGACTG CTACCATGGT GATGGACAGA GCTACCGAGG CACATCCTCC ACCACCACCA 6240 

X 

CAGGAAAGAA GTGTCAGTCT TGGTCATCTA TGACACCACA CCGGCACCAG AAGACCCCAG 6300 
AAAACTACCC AAATGCG TAT GTCTTTGATT TTTACTGTAA GAGGGGCATC AGCCAACTGA 6360 
AATTTCTGTT AAAAGAGCCA TGCTTCATGC TTCAAGCCAA CTTCCTAGGA CCAAATTTCT 6420
CTTAGACCCA GAATGTGTAG AAAAATGTCT CAAGAATCTT GCTTTTGAAG AAAGGGCCTG 6480
CGAGAAGAGA AATTTTAGGC TGGCTATTTT TCCTGAGTAG TTTTATGGAT GCAGGAGGAC 6540 
ATCTGGAGGT GATGAGGTCA CATTAATTGA AAGCTCAGGA GTACATATGA GCAAATGCTT 6600 
AGAAACAGTA CCATTCCACA ATGCCCACTA AATATCAGTG CAATATTTCT ACCATAGAAA 6660 
TCTATCATTT TAACCTCCAA CCCCTGAAAT GAAGGTTGAA TTTGCTATTT TTGTCTTGGG 6720 
TCACAAGTAA ATATACTTTA TATATATAAG TATGAATATA TATACACACA TATATATGTA 6780 
TACATATGTG TGCATATATA AATACACACA TATATGAGAT ATACAAGTAT ACATATATAG 6840
TGTGTATATA TATGTACACA TATATGTGTG TATATATATG TACACATATA TGTGTGTATA 6900
TTAGAATATA TATAACATAA ATATGTATAT ATATATATTC TGACCTGTAT AAACACAGTG 6960 
GATCCTGAGC ACCAGTGGCC TGAAAGGATA TGGGTTGCTG GGACATGAAG AACAAAAGCA 7020
GGATACGCAG ATGCTGAACA GCGAAAGAGG CCATTAGATG AACAGAAAAC CAGGTCTAAC 7080 
AAGGACAGCT TTTCTTCCAT AAATGAGTAC ACAATATATG GAAAAAACTA TTTTTACATA 7140 
TTGGAGAACA GATAAACTGA GATAATTTAG AAAGGGAATC AAATGAGATC AACCCAATAA 7200 
CTACCTTGGC TTTGTXCCTG GAGACTTCCT GGGCTGAAGA ACAAGGAGAT GGAGCCCAAG 7260
CCGACCACAG CAGTCTTGCT GAACTGAGGA AGGAGACTGG AGTTGGGATT ACTAAAACAG 7320 
CTGAGATTTT CTAGGCTAGG TAATAACATG AAAGGAAACA TTGTGGAGGA AAGCAGCTCC 7380 
AGGAATGTCC ATAGAAAAGT CCTCAAGTCT TTGGCTAAAT AGAAAGCTGC ATATGCACAG 7440
GGAGAGGTTC CAGAGAGAAA ATAGGATAAA GAACAGCTAC TGGGGAAAGA AAAACTGCAG 7500
GGGAACAGTG AGCTCAATGG AGATGCCAGA GCTCACATAG CACTGGGGGA TATTTGAGTT 7560 
CTGACCAGCC TGAGGAGAGA CCTCGCTGAA CATCTTGGGC ATTCAGTAGT CACCACATAA 7620 
AGCCAAACTT TGGGAGTAGG ATTAGTGTAT TCCTATAATA AAGGCCACTC CAGAAACAGC 7680 
ATAGTAAAGC TGAAAAGCAA GTCTAAAAAA ATCAACACGA TCTCCAAGTA AATTAACTGA 7740
TTGCCAGAAG AAAATTCAAC CCTTTAGAGG CAAACAACAA AATCAAGTTG CTCAGTTATG 7800 
TGGCATCCAC AATGTGTGAC CTAAATTTAT AACTTTACCA GACATACAAA AAGCATTTAC 7860 
TGTGATCCAT AACCAGGAGA AAAAGCACTC AAAACAAATA AACCCCAAAA TGAAGAAATT 7920
GGCAAGAAGA TTTGAAATAT ATATATATCA TAATTGTGTT CAAGGATTTA AATAAAACAT 7980 
GAACATGGAA GAAACAAATG GATAATATCA AAAAAGAAAA ATTATAAAAT AACCAAATAG 8040
AAATTAAATA ACTAAAAAAG TGCATGTTTA ATGAAAAATG TACTGGCTAC CCTTACCATC 8100 
AGGTTAGACA TTACAGAAGA AAAAGTTAAC TAGAAAATAA TTCAATAGAA GTGATACAAA 8160 
CTGCAGCACA CACATACAAA GACTGAAAAG ATAAAGAAAC AGAGCCTCAA GAATATCTAT 8220 
GAAAATATCA AAAGATTTCA TATATGTGTA AAGCAAGTCA CAAGAGAGGA AAGAGATATT 8280 
GGGACAGAAA AAAATACTTG AAGCAACAAG AAAAATCTTA TTAGAAGCCA GAAGAAGAAA 8340
ATATATGTTT ACACAGAAGA ATAGTGGTAA AAATGACTGA TGCCTTCTCG TCAGAAACTA 8400
TGCTGGTCAG AAACAATGAA ATAACACCTT TAAAGTGATA GAAAAAAATA AAAAAGATTA 8460
ACATAGAATG TTATATCCAG CAAAAATATC CCTTGAAAGT GAATGTTATA TAAATACATA 8520 
TTCTGCCTCC CCCAAAATAA ATAAAACACT AAGAGAATAT TTCATTACTA GGCTTATATA 8580 
AGTAAGCATC TCCAGTAATT TCTTAAGGTA GGCAGCGTTC 5460
ATTGCAGTCT TCAGCATTGC AGTTTCTGAG GAATGTGGCC CCTGATTCTG TCATCCTAGA 5520
GAAACCTGAC ATGACTGTAT TGATTCCATA AGACTGTATG TCATCCTGGG TCTCTGTGGC TCTTCATAAT 5580
CATCCATTTT TTCCCTGTAC TTTGGGAATG GGAAAGGATA CCGAGGCAAG 5640
AGGGCGACCA CTGTTACTGG GACGCCATGC CAGGACTGGG CTGCCCAGGA GCCCCATAGA 5700
XII
CACAGCATTT TCACTCCAGA GACAAATCCA CGGGCGGGTC TGGAAAAAAA TGTAAGCCAC 5760
TTTGATTTGG ACTCTTTGGC CTTTTGCTCA CCAATCTTTG CAAACAGAAT TGGTTCTGTG 5820
TTACAGAAAA TCTGACCTGG ACTGCTCTTT TTTGTAATGG GGGAGAGGGG ACAGAAGAAA 5880
ATATTGGAAA GGCATCAGGG GGCTACGCTA GAATATAATT GGCCTTAGTA TGGAAAGTAC 5940
AAGCAGCACA GGCCAGGAAA CCTCCACACA TGTGAGGGTT CTCAGGCCTC TTCCCTTTAG 6000
TGACATTTCT TTAAAGTTTC CATTATTGGG GACTGTCTCT AGTTTCTAGT GTTTGTATGC 6060
TAGGTTCCAG TAATCAAAGA TGCCCTTTAT GAAATTTAAG TCAGATTTTT CGAGAAAAAA 6120
TTTGGATGGG CCATCAGGTC ACCATGGGAC TTCCCTTAGC CTCATCGATT CTCTGCGATG 6180
GTTTACTTTG GGGCCTATGA ATAGGGAAGA CTGAGATATA GGAAAAACCA AAGTGTCTGT 6240
GTTCCCCCAC TCTCACACCC ATGCAGCATA ACACTTCTCA CACCAGATGT GGGGGGATTT 6300
AGTGCTGACA CTCTATCTGG AGACAGCGTC AGATCCCATA AGTTAAGGCT CAGTCCCACA 6420
AGACCGCCCC ACTGCAGATG CCAATCCCAA GTTCCAGGCG GTGACCTGTA CTTCTGCCCA 6480
ACTGGACAAA AATCTGTTTT TCTACTTGAT TACTTTGCTA GAGTGGCTCA CAGAACTCAG 6540
GGGAACACGT TACTTTTATT TACCCATTTG TTATAAAAGA TATTACAAAG GATCCTGGTG 6600
AACAGCCAGA CAGAAGAGAT GCACGGGGCA AGGCATGTGA GAAGGGGCTC AGAGTTTCCA 6660
TGCCCTCTCC AGTGCACCAG CCCCCGGTAC CCCAAGTGTT CAGGAACCCA GAAGCTCTCC 6720
AAGTGCAGTC TTGCTGGGTT TTTATGGAGG CTTCATTACA GAGGCACAGT TGAATACATC 6780
GTTGGCCATT GGAGACCAGC TCACCTTCAG CTCCTGTTCC CTCCCTGGAA GTTGGACGTG 6840
GGGGGCTGAA CAGTTCCAAC CCTGCAATCA CATGGTTGGT TCCTTTGGCA ACCAGCCCCA 6900
TCCTGAGACT ATCCAAGAAC CCACCAAGAG TTGCTTCATT CAAACAAAAG ATGCTCCCTT 6960
CACTCAGGAA CCCCCAAGGG ATTTAGGAGC TCCGTGTCAG GAACTGGGGG GCAGAGACCA 7020
AATATACGTT TCTTATTCTA CCACAGTGTC ATATGAATGG GAGGACAACA CTGCCTTTCT 7080
GTGTCTTGCC CCATAGAGGG CGCACAATGC ATGGAAATAA ATGTTTCTGA ATCAACAGCA 7140
AACAGGCTTC ATCGGGTAGG AGAGCGCTGA GCCCTCCAGG GACAATGCAC ATCAATGATG 7200
TCCCACTGTC CTTTGGTGCT GGGGCTCTAA GGCCTCCACT GGGTCAGGCT CCTGAAGGGA 7260
GACCCATTCT CCAAAGACCC CCGAGGGTCA CCACTCCCTG TCCAGGGGTG TGGCCTCATA 7320
GCTCCTTTTG AACAGGGGCA CAGGAAGGAC GGCTTTAGAG CATTCAAAAA ATAACTTTGC 7380
CAAAATAATA ATAATAATAA TAGAAAGAAA GGAAGAAGAG GCTGAGCATG GTGGCTCACA 7440
CCTGTAATCC CTACACTTTG GGAGGCTGAG ACAAGCAGAT CACCTGAGGT CAGGAGTXCG 7500
AGACTAGCCT GGCCAAAATG GTGAAACCTC ATCTCTACTG AAAATACAAA AAAAAATTAG 7560
CCAGGTGTGG TGGCGTGCAC CTGCAGTTGC AGCTACTCAG GAGGCTGAAG CAGGAGAATC 7620
GCTTGAACCC AGGAGATGGA GGTTGCAGTG AGCTGAGATC ATGCCACTGC ACTCCAGCCT 7680
GGGCGACAAG AGCAAAACTC CACCTCAAAA AAAAAAAAAA AAAAAAAAAA AAAGAAGGAA 7740
GGAAAAAGAA ACACTCCTTT ATGTCTTCTA AGGATAGACA TGAAATGCGT GAGCCTTGGA 7800
ACACCTTCTC CCTCTCCTGC CCCACGTGAG CTGGAGCTTA CATGCCTTCT TGTTTTCAGX 7860
ACTGCCGTAA CCCTGATGGT GATGTAGGTG GTCCCTGGTG CTACACGACA AATCCAAGAA 7920
XIII
AACTTTACGA CTACTGTGAT GTCCCTCAGT GTGGTAGGTT GCCTTCTTTT TGGTAAGGAA 7980
ACTGCTTACT TAATATGGAT TTGCAACAAA AAAGGAAAAG GGCTTCTGAG CAGACTGCTT 8040
CTGGGGAGGA GATAGCTGCC CTCXCCATCA GACCCCACTC TfcCATCATGG GCATCTTGAA 8100
TCTGCCCTAC TATTGGCCAC ATTTGTTAGA GGAACACCTG CCCATCGCCC CAGGCACACA 8160
TAAATAAAAT AAATGTAAAA TTCCCAAAGA GCAAGCTTAG AGGTAATCTA GTCAGCCCCA 8220
GGATGGTCCC ACTGAATGCT GCCATGTCTA GCGTGGGATG CATGAAAAAT TTAGAGTCAT 8280
TCGGATGAAA AACTTTCCCT TTCCACAGCT GAGAAGTAAG AAAGAAAATA CAAACAGCAG 8340
GAAACAGGTA AGCATGTAAC GCACATTGTA AACCTCAGAT GGCCATCCTA GGAATTCAAT 8400
GAAAGGTAGT GCAGCTCTTT AGCCCCAGAT GGCCTTTCTT ATAAGTTTAC TACTCACAAG 8460
TCACATTAGT GACATAGCTT AGAGACTGCT TGTTGGGTTC CATCCTCATT GCTCTGAGAC 8520
TCTTGTTGGG AGTATGAGGC TTGGATCAGG GGAAGGGGAG TTGACATTAG TTCTTAAAGA 8580
ATTGGAATAA CAAATCCATG GGTATTTCTG AAAAAAAAAA AAAAAAAAGA AAGGAAGCTA 8640
CTTGGAATTG TCCCATATTT AACATTCTGC TGACCAATCA ATTTGTCCTA GTTACAGAAA 8700 
ACCACCCTGG ACTTCTCCTA TGCATAATTT GGTTGCTTGT GGTTGGGTCT GCCATGTGGA 8760 
GGGACCTTGA GCTGGGGGAA GGAGCTTGGC CTCCAAGTCC ACTGAAGACC AGCATCCTGA 8820
GATTGCCTGG GAAGGTGGTA CAGGGCAGTG ATGAAGATCA TGGGAGCCAC ACTGCCCAGC 8880 
TTCGCATTTG GGCTTCTCCT AGGGACACCA AGAGGGAGGA AGGAGGGGTT AGGATGGTAT 8940
GAAAGATTCT ACTTGGCCAA TATTATTGTA ATGCGGCATT GTGATCTCTG GATTTAGCAT 9000 
GAGTTGATAG CTGACTTTTT CTGCAGAAGC ATCTTGGTGG CACCTCTAAC TCAAAGTCCC 9060 
TCGATGGAGT CAGTTCCAGT TCTCCACTTC TGGCCCCATC TGGTACACAC CACTGCCTCT 9120 
CACTGCCTGG GCTCTCTATC CTTGACAGGC TGCCTTGAAG TTGAGCCCAG ACTGATTTTC 9180 
TTGCCTCAGA CCCCACTACC GTGCCTGGGA CTCATGCACC TTTGACTCCC ATGGAAGGGA 9240 
AGTGCAGTAG TTTCCCAGGT GCAATTCTGG TGTCCTCACC CACATTGAGG ATGTACAAGA 9300 
ATCAGGTTCT TAGAGATTGG AGAAAGAAGG AAGAATGGGA ACAAGATTTT TCCCAAAGGA 9360
CTGTGAGGTC CCCCACCTAA CCTTGATGTG AGACAAGTGA GGTTAACCCC AAGCCTGGTG 9420
AGAAGCGTTC CCATCAGACA CTTGGAAATC CTGAGGACTG TTTCATGCAG AAGGATATGG 9480
TTTATTCAGG TTTGACTCAT GCTTGAGAAA GCTAGAGCCT CTGGTGGTGA ATGATTTTAA 9540
TAACTATTTC CTTTCCACCA ACATATACAG TACAAATAAT AATAAGCAAA AATAAATAGA 9600 
AACATTCAGT TTTGTTTTGA ATAGTAGGAG CAGGGTACCA TCATTTCTGT AGTTACTCTT 9660 
TTAGTACAAC GATGCATGTC TACTGTATGT AAGGCATACT AGCAGAAATT GAGCTCAGCA 9720
CTAGAGAAGA TGATTGCATT CTATGCCTTG CTTCTTTTTT AAAAAAAAGG CTTCCATAGA 9780 
TAGATTCTCA GAACAGCCCA TGGCAAATGT AAAGTTATTT GGAAAACCCA GGTTCCAGAT 9840
TCACTAGAGC ATAGAATCTC TGGTTGGTTG GGAAGGAATT TCCTCTTACA GTTGTTACTA 9900 
ATAATTGTAT GAACAATTAT TTAAAATATT AACATTTACA TTTGTGAAGA CCTTGAAGGG 9960 
CTGGAGACAA CAGAGAAGCA TTTTTGAATA CCCTCTGCAG 10000 
CCCCTGCACT GTTGTAGGCA TTGGTGGATG GTACCAAAGA TGGGACACTG TCCCTACCTC 60

CAGAGACCCT GTGGGCTGGC TACAGAGAGA AGGCAGGGAG GAGGAAAAGA AGAATAAAGT 120 

CATATGTTTA AGTCACCCCC ACGGCCGTTG GTTAGTCATG GGAGGCTCCC CAGAGGAGCT 180 

GTCCTGAAGC TGGCTGACAG AAGGCAACAT TTCAACTTAG GACAGTAATC CTTGCTACAT 240

ACAATCACAT ACACACACAC ACACACGTGC ACACACAGAG ACTCACATGG AAAAATAAAC 300 

CTTTGTGCCT TTCAGCAGTG ATGACAATTA TGGTTTTCAG TAAACTTTAC ATGGTTTAGA 360

TGGTGATGGT GATGATGATG ATTATGGGAA GGATGGCATC ATGTTCTAAA CATACTGCAT 420

GGAGTCAGAA TAACAATGAC AAATAACCAT TTGTCCCAAT CAAGGTTTTC TCAGAAAATA 480

TCTCATTCTG ATGCTAAACT ATACCAGTCT GTTTGATCAC TTCTCCAACA AAATAATTAC 540

AAAGTGCTTA TATTTTCTTG AAAAGAGAGG GTCCTGTGTT GTCTACTACC ACTTTTGAAA 600 

CTTAGAGAAA ATGTTCCAAA AGATGATGAT TTTACTATTT AGTTCGGCCT TTAAGATGTC 660 

AAAAACTCAG TGCTTGGAAT TTGTCTCGAA TTACACCACA AAATTGCTAC CTTGTCTCAA 720 

ATGGGATTTC TTTCCCACCT TGTGCCACAG CGGCCCCTTC ATTTGATTGT GGGAAGCCTC 780 

AAGTGGAGCC GAAGAAATGT CCTGGAAGGG TTGTAGGGGG GTGTGTGGCC CACCCACATT 840

XIV ^ 

CCTGGCCCTG GCAAGTCAGT CTTAGAACAA G GTAAGAACA GGCCCAGAAA CGATTTATAC 900 

TGTCCCTCCA CGTAAGCCCT GCAAAACCCT TCTACATTTA CATAAAATCC ACACAGCTGA 960 

GGCATCAGCA CCTGCCTCTA AGTTTTCTGA AGGAGGAAAA AAGCTACAAA AATTAATATA 1020 

TGTATATATA CATATATATT TTTATAGGTT CTCTACTGTG AAAATGACAA AAATTGCTGT 1080 

CTTTTTCTTG ATCTGGGCAG CTCCATCAAA ATCTGTAGGC ACAGTGATTT GCACCAAGTT 1140 

CCAATATTGC TGGAAAATAC TGAAGATGCT CTGAGGATTT CTATGGATAT CCATTGTCTC 1200 

ATTGTCAGAT GAAAAGAGGG GGAAGTTTTT AGAAATGTGA CACTTTCTGG GTTGGGAGAG 1260 

CAAGGACAAA ATTATCTCCA GTCTATCACA GGCACAGATT CTTTTTCTTT GGACACTTTC 1320 

GTGAATCATT GAATTCAATG CAGAGGCTAC TCATCCATTC GCAAACAAAA AAATTCTAGG 1380 

TCATGATCCC CATAAATGAA GAGTGATCAG TCCAATCCCA GGGAACCTGG ACATTTTGGG 1440

TATTGTTTCA GTGGAACATG CCTTTCATAA GTTCCATTTT CTTGGGTATC TCTTAGGAAG 1500 

CAAGCATAGG AAACAGGCCC ATCCGTCTGC CTGTTTTGCT TCCTCATCTC ACTTCTACAC 1560 

GAGGGCGCCT GTGCTCAATT GCTGTTTTCC CCTAAAGAGA CTCTTTTCCA TAAGTTTGTG 1620 

F3&2H 
AAATGCCATC GACAAACCTG ATCGCATTGC ATTTCACTCT GCTGTTGAGT CGATTTTTCT 1680
TTATTTTATC ATTTAGTAAC TCCTTGCTCT ACAGAGCTTT CACCTTCCAC ATATTTCAGA 1740
TTCATTCTTT CCTAAACTAT GTGGTGGTCT ACGTCCTCAC TGACTTATCA ACATGCTACC 1800
ATCATGCACT TCCTATCTCT ATTCCTCTTC ATTAAAATCT GGTTCCAAGT GGCTCACACC 1860
ATTATTCTGA GCTATTACCT GCCTACGCAG TCCTAGAAAG TAAGTGATTC AGGAAACATT 1920
CCCCAAAAGT AAAGTTTCTC AGGTAAGATC AGAAGACTCC CATGAGTCAC TGCTGCTCAG 1980
GATCACATCT GGCTCCTTGA AGAGTGATTC ATCAGACCTT ACATAGATCT TGTCATAAAA 2040
ATGAAAGAGG CCTCGGGGGA AGGTCTTGGG CTGGTGGCTT CTGTTGGAGT CCTGGGCTGT 2100
GGGGTGAAAG CCGTGGCTGT AGAGCTTCAT GCGGAGTTAC TTAGCTTTGC TCTCCTGTGG 2160
ACAGGCCATG CTGTGCCTCC CCCAAGCATC GGAAAAATTG GCATAGATGG GCCCTTCTCA 2220
AAAATCCCAC TCCTGGAGCA CTGGCCAAAA TTACTACCAT CCTGATGCTG GGCTTGCAGT 2280
CCTTTCCTTT GGGAATATGA ACATGGTCAA AATTAAGTGA ACGTGTCTTT CTGGCTTTCT 2340
GTACAATGGA GCAGAACAAA GTATCAATXT AACTAAAATT TGAACTAAAT CCTCTTTCCA 2400
GGTTTGGAAT GCACTTCTGT GGAGGCACCT TGATATCCCC AGAGTGGGTG TTGACTGCTG 2460
XV
CCCACTGCTT GGAGAAGTAT GTTTAGGGGA CAATTGACAT GAAGTCTTGT CTTAAATACT 2520
TTTTCTGTCC TTCTTTTCCT CCTTTCCTCC TTTCCTTTCT CACTCTTCCT CCCTTCCTTC 2580
TCTGGCTGTG ACACTAGGGA CCAGGCCAGG GCAATTGGAT AAGAGAGAAG GGAAGGGTTT 2640
CTAGAAAGAA ACTGCAGAGG AAAGACACAG TACAGATGAT TTTGTGGGCC TGAATAAACT 2700
GCAGAACAGA GCTGTTCACT ACCATAGGCT GTATCAGTCT CTGCCCAAAC AGCCCAAGAA 2760
CATTCCTTAA CTGCCTGTTT CAAGCAAATC ATGAATTTTG CTTCTTGCCA CTCAGAAGTC 2820
ACTAATTCTG AGTGGCCAAG GGTGTCAGGG AGACAGCACC AATTTCATGG CACAGAGGTT 2880
ACCTGAAGGG GCTGGATCAT ATTTTCCTCT TGACATCCTC ATCTTTTCTA GGTCCCCAAG 2940
GCCTTCATCC TACAAGGTCA TCCTGGGTGC ACACCAAGAA GTGAATCTCG AACCGCATGT 3000
XVI
TCAGGAAATA GAAGTGTCTA GGCTGTTCTT GGAGCCCACA CGAAAAGATA TTGCCTTGCT 3060
AAAGCTAAGC AGGTACTCGT TCACCTGTGG TCTTCACCCC ACGCTGGTGA AGATATTTGC 3120
TTTATGTCTG GGTTTTATGG GCCATGGCAC TGCATGGCAG TGGGAGGAAC TGTCTATCAC 3180
ATGAAAGGCT CAAGGGCTTT GGGGACAGCA TCAATCTTCA ACCCTAGCCC TGCCACATGC 3240
TAGCTGTGCT CTTGAGAAAG GCAGCAGGAC TCCGTTTTCT CATGTGGAAA AAGAGTTGAA 3300
ATGAGGTACT CTGTTACTCC TAGAACTCAC TTAATGTTCA CCAGTTCATA CACATTCATG 3360
ATCAGAGAAC GATTCAGTTA TTCCAGGCTG ACAATTCCCC CTTCATCATA ATATGTTTAA 3420
GAGAATCATA TAAGACTATA TTTGTTTCAA AGCACTTTAA AAACCACAAG ATCGAGTTGG 3480
GTGTCTGGTG TGGGTGCCTG TAATCCCAGC TACTTGGGAG GCTGAGGCAG GAGGGTCACT 3540
TGAGTCCCGG AGTTTGAGGC TGCAGTGAGT TATGATCGTG TCACTGCATT CCAGCCTGGG 3600
CGACAGAGTA AGACACTGTA CCAAAAAAAA AAACACCAAA AAAACAAAAA ACAAACAAAA 3660
AAAAAACAAC TTCACAATGT CAAAAAAATC ACAAATACAG TTTATAAATG TAAATTATAT 3720
TATTATTATT GTCTTCTTTG ATTTGATTTT CTCTTTCCTG TTGAAATGTT GTTTCACTAA 3780
GCCTGACAAA GTGAAACATT TGCTTATGTC ACTCATTTAG TGCTGTTTGG AGCCAGATAC 3840
TAGTTGAGTC AGCTAAGAAA CAGCTATTTG TAGGAGAAGC AGGTTTGGGA CAGGTGACAA 3900
GGCACGCAGG GCGCTCGCTG TGCTGGTGGT TCTGGAAGAC AGGGTGTCAG TGTGGACAGG 3960
GATGAGCATG GCCTGGATGA GAAGGCACGG GGCAGGAGCC TGAGCTGCTC TCCTGGGCCT 4020
GGCCACAAGC CCAGGGCAGC TTCTCTGGGT CTGTGAACTG AGGGGTGATG TCCTGGGATG 4080
CTCTGACACT CTAGAAGGAG AGAAGAGCCT TTCCAGCTCA GCCTTTATAA ACAGTAGCTG 4140
ATCTCCCTCC TGCTCCCCAG TGTCCTCCCC GCCATCCCAG CAAATGTGCA AATAGAAGGT 4200
CCCCGTTCCT CATGATCCTC AGAGAGCTGG GGTGTTCTGA TGGCTTGAAC AAGTAATTTG 4260
GAAATTTTGG GTTTTGGAGG AGTTCTCTGA TAGGCTGATA CATTTCGAGT TTAGAGTTCC 4320
CACCCCACAT CCCCACACCC CGAGTCTAGG GCATTTAGTG CTCCACCAGG GAACCTGTAG 4380
AGTGAGGAAG TCTGCATGAC AGGCTGGGCC TTCTGATGAT GCTCAGAAGC AGAAAGTGTG 4440
CCTGCTTCAA AGTTGGTGAC GATGATGTTT CTTGATCAGA ATAGGGCATT TCTTATTTCC 4500
AATCCTTTAT CCTCTTGAAC TTACTAAAGT AGAATCAGGT CTAAAAACCG GAGTTCTAAT 4560
GTTTGAGAGT CCCTGGGACT CTAAAGTATA TGAATGTTCT TTGAAAACAA ATACCATTTT 4620
GTTCAAGCAA AAGGCTTATT TCCAATCCTC TTTCATTTGG TATCAAGTAT TTTACTGGAT 4680
TCTTACAACT ATGGCGTAGT AACATTCACT GAGGAGGAAA TGGAGGATCC AAGGATGGAG 4740
CAAGTTGCTC TGGGCACACA ACACATTTGC AATTTTACAG CCTCTTGGTG GCATCTCAGT 4800
CAGACATTCC ATGCACTGAT CAATGCCCTA TTCGATTAAT GTAAAAGGAC ACACTCAGCA 4860
TGAGATTCCA GTTGTGCACA GAATACTACA TGAGAAGTGC GCCTTTGTCA TCCCTACTTT 4920
CAAAGGTGAA GGCCACCAGC AGTATCTTGC ATGCAACTGA TGCCTTTCAA ATGAAACCTT 4980
ACATCTGCAT AGTCCATAGA CAACCACAGG CAAATGTGAG GGTGAAACTC TGTGTTCTAC 5040
GTTGCTCTGT GTCAGTGAAG CAAGGCAGTG CAGXTCAGAG GGCTCTGGGG CCTCAAGACA 5100
GGGATGACTG GTTGTGGGTA CTGCAGCTGC GAGCAGAGCA GTCAAACATA ACTGCTGATG 5160

CTTTTCTTTC AG TCCTGCCG TCATCACTGA CAAAGTAATC CCAGCTTGTC TGCCATCCCC 5220 

XVII 4r 
AAATTATGTG GTCGCTGACC GGACCGAATG TTTCATCACT GGCTGGGGAG AAACCCAAGG 5280 

TGAGATAAAT TCCATTGCCC ACATAACGAA TTGGTTTTGA CCTACAGTCC ATGTGACAAA 5340

ATGATCATTT TGGAGAAAGC TGTGCAAATT CCTATCCATG AATGTGGTCC ACCCCACTCC 5400

TGATTTTGCC TGGGCACCTG TCTATGTCTT AATCAGTCTT CAAGGCACAT GATCAAAGGG 5460

AGGAAAACTG TGTCTTTGAG TCTCTCTCTC TCTCTCTGTT TTCAGAACAT TTTTATTTCA 5520 

ATTAATTAAT TTTTAACTTT TATTTTAGGT TCAGGGGTAC ATGTGCAAGT TTCTTGTATA 5580 

TGTAAACAGT GGTTTGTCAT GCAGATTATT TTGTCACCTA GGTACTAACC CTAGTACCCA 5640

ATTCTTAGTA TTTCCTGCTC CTCTCCCTCC TCCCACTCTT CTCCCTCAAG TAGGCCCCAG 5700 

TGTCTGTTGC TCTCTTCTTT GTGTCCATGA GTTCTCATCA CTTAGCTCCC ACTTATAACT 5760 

GTGAACATGT GGTATTTGGT TTTCTGTTCC TGTGTTAGTT TTCTAAGAAT AACGGCCTCC 5820 

AGCTCCATTC ATGTTCCTGT AAAAGATATT ACCTCATTCT TTCTTATGGC TAAACAGTAT 5880

TCCATGGTGT ATATGTACCA CATTTTCTTC ATCCAATGTG TCATTGATGG TCATATAGGT 5940

GATTCCATGT CTTTGCTACT GTGAATAGTG CTGCAATGAA CATTCATGTG CATGTGTCTT 6000 

TAGGGTAGAA TGATTTATAT TCCTCTAGGT ATATCGCCAG TAGTAGGATT GCTGGGTTGA 6060 

AAGTTAGTTC TGCTTTTAGC TCTTTGAGAA TCACCATACT GCTTTCTACA GTGGATGAAC 6120 

TAATTTACAG TCCCACCAGC TGTTAGTGTT CTCTTTTCTC TGCAACCTTG CCAGCATCTG 6180 

TTATTTTTTG ACTTTTTAGG AAGCCATTCT GGCTGGTGTG AGATGATTTT TCATTGTGGT 6240

TTTGATTTGC ATTTCTCTAA CGATCAGTGA TATTGAGCTT TTTTTCATAT GTTTGTTGGC 6300 

CACAGGCATG TCTTCTTTAG AAAAGTGTGT TAGTGTCCCC TGTCCATTTT TTAATGGGGT 6360

TTTTTTTTTC TTGTAAATTT GTTTAAGTTC CTCATAGATG CTGGATATTA GACCTTTTTC 6420 

AGGTGCATAG TTTGCAAATA TTTTCTCCTG TTCTCTAGGT TTTCCCTTTA CTCCCTTGAG 6480

AGTTTCTTTT TCTGTCCAGA AGCTCTTAAG TTTAATTAGA TCCCATTTGT CAATTTTTGC 6540 

CTTTGTTGAG ATTGCTTTTG GCATCTTCAT GAAATTTTTG CCCGTTCCTA TGTCCAGGAT 6600 

GGTGTTACCT AGGTTGTCTT CCAGGATTTT TGTACTTTTG GATTTTACAT TTAAGTCTTT 6660 

AATCCATCTT GAGTTGATTT CTGTATATGG TGTAAGGAAA GGGGTCCAGT TTCCATCTTC 6720 

TACATATGGC TAGCCAGTTA CCCCAGCACC ATTTATTGAA TAGGGAGTTA TTTTCCCATT 6780

GGCTTGTTTT TGTCAGCTTT GTTAAAAATC AGATGTCTGT AGGTGTGTGG CCTTATTTCT 6840
GGGCTCTCTA TTCTGTTCCA CTGGTCTACG TGTCTTTTTT TTTTTTTTTT TACCAGTACC 6900
ATGCTGTTTT TGTTACTGTA GCCCTGAAGT ATAGTTTGAA GCCAGGTAAT GTGATGTCTC 6960
CAGCTTTGTT CTTTTTGTTT AGGATTGCCT TGGCTATTCT GGCTCCTTTT TGGTTATATA 7020
TAAATTTTTG AAGTAGTTTT TTAATAGTGC TGTGAAGAAT ATCATTGGCA GTTTGATAGG 7080
AATAGCAATG AATCTGTAAA TTACTTTGGG CAGTATGGCC ATTTTAATGA TATTGATTCT 7140
TCCAATCCAT GAGCATGGGA TGTTTTTCCA TTCATTTGTG TCATCTCTGA TTTCTTTGAG 7200
CAGTGTTTTG TAATTCTTAT TGTAGAGATC TTTACCTCTC TGGTTAGCTG TATTCTTACA 7260
TATTTTATTC TTTTTGTGGC ATTTGTGAAT GGGACTGTGT TCCTGATTTG CCTCTGGGCT 7320
TGGCTGTTGT TGGTGTAAAG GGATGCTAGT GATTTTTGTA CATTGATTTT ATATCCTGAA 7380
ACTTTGCTGG AGTTGATTAT CAGCTGAAGG AGCTTTTGGG CTGAGACTAT GGGGTTTTCT 7440
AGACATAGAG TCATGTCATC TGCCAACAGG GATCGTTTGA TTTCCTCTCT TCCTATCTGG 7500
ATGCCCTTTA TTTCTTTCTC TTGCCTGATT GCTCTGACCA GGGCTTCCAA TACTATGTTG 7560
AATAGGAGTG GTGAAAGAGG GCATCCTTAT CTTGTGCCAG TTTTCAAGGG GAATGCTTCC 7620
AGCTTTTGCC CATTTAGTAT GATGTTGGCT GTGGACTTGT CATAGCTGTC TCTTATTATT 7680
TTGAGATATA TTCCTTCAGT ACCTAGTTTA TTGAGAGTTT TCAATATAAA GGATGGTAAA 7740
TTTTATCAAA ATCCTTTTCT GCATCTATTG AGATAATCAT GTGGGTTTTC TCTTTAGTTA 7800
TATTTATGTG ATGAATCACA TTTATTGATT TATGTATGTT GAACCAAGCT TACATTCTGG 7860
GGATAAAGCC TACTTGATCA CGATGGATTG GCTTTTTTAT GTGCTGCTGG ATTTGGTTTG 7920
CAAGTATTTT GTAAAGGATT TTTGCATCAG TGTTCATCAA GGATATTGGC CTGAAGTTTT 7980
TTGTTGTTTT TGTGTCTCTG CCAGGTTTTG GTATCAGGAT GATGCTGACC TCATAGAATG 8040
AATTGGAGAG GAGACCCTCC TCCTCAGTTT TTTTGAACGG TTTCAGTAGG AATGGTCATA 8100
GCTCTTCTTT GTACATCTGG TGGAATTCAG CTGTGAATCT ATCTGGTCCT GGGCTTTTGT 8160
TGGTTAGTAG GCTATTTATT ACTGATTCAA TTTTGGAGCT CATTATTGTT CTGTTCAGGG 8220
AATCAATTTC TTCCTGGTXC AGTCTTGGGA GGGTGTATGT GTCCAGGAAT TTATCCATCT 8280
CTTTTAGGTT TTCTAGTTTG TGTGCATGGA GCTGTTTGTA GTAGTTTCTG ATGGTTATTT 8340
TTATTTTTGT GGCATCAGTG CTAACATCCC CTTTGTCATT TCTAATTGTG TTTATTTTGG 8400
TCTTATCTTC CTTTTCTTCA TTAGCCTAGC TAGCAGCCTA CCTATCTTAT TACTGTTTTC 8460
AAAAAACCAA CTACTGGACT TGTTGATCTT TTGAATGAAT TTTCATGTCT TGACTTTCTT 8520
CAGTTCAGCT CTGATTTTGG TTATTTCTTG CCATCTGCTA GCTTTGGGGT TGATTTGCTC 8580
TTGTTTCTCT AATTTTTTCC ATTGTGATGT TAGGTTCTTA ATTTGAGATC TTTCTTCTTG 8640
ATGCTAGCAT TTGGTGCTAT GAATTTCTCT CTTAACACTA CCTTAGCTCT GTCCAAGAGA 8700
TTCTGGTATG TTGTATCTTT ATTCTCATTA GTTCAAAGAA CTTCCTGATT TCTGCCATAA 8760
TTTCATTATT CACCCAAAAG TCATTCAGGA GCATGTTGTT TGATTTCCAT GTAATTGTAC 8820
GGTTTTGAGT TATTTTCTTA GTCTTGACTG GTATTTCATT GTGCTGTGGT CTGAGAGTGT 8880
GTTTGGTATG ATTTTGGTTC TTTGGCACTT GCTGAAGATT GTTTTATGTC CAATTATGTG 8940
GTTGATTTTT AGAGTATGTG CCACATGGTG ATGAAAATGT ACATTCAGTT GTTTTGGGAA 9000
AGAGAGTTGT GTAGAGGTCT ATCAGATCCA TTTGGTCCAA TGCTGAGTTC AGGTCCTGAA 9060
TATCTTTGTT AATTTTGTGC CTCGATGATC TGTCTAATAC TGTCAGTGGA GTACTGAAGT 9120
CTCCCACTAT TATTTTGTGG GCGTCTAAGT CTCTTTGTAG GTCTCTAAGA ACTTTATGAA 9180
GCTGGGTGCT CTTGTGTTGG GTTCACATGT ATTTAGGATA GTAGATCTTC TTTTTGAATT 9240
GAACCCTTTA CCCCTTTACC GTTATGTAAT GCCCTTCTTT GTCTTTTTTG GTCTTTGTTG 9300
GTTTAAAGTC TGTTTTGTCT GAAATTAGGA TGGCAACCCT TGCTTTTTTG TCTGATTTCC 9360
ATTTGCTTGG TAGGTTCTCC TCCATCCCTT TATTCTGAGC CTATGGGTGT CATTACATGT 9420
GAGATGGGTC TCTTGAAGGT AGCATACCAG TGGGTCTTGC TTTTTATCCA GCTTGCCACT 9480
CTGTGCCTCT TAAGTTGGGC ATTTAGCCCA TTTACATTCA AGGTTAGTAT TGCTATGTGT 9540
GAATTTGATG CCCTCATTGT GTTGTTATGC TGGCTTGTTT GTGTGATGGT TTTATAGTGT 9600
CATTGGTCTG CGTATTTAAG TATATTTTTG TATTGGCTGG TAGCCATCTT GCTATAGTTA 9660
GTGCTTCTTT CAAGATCTCT TGTAAGGCAG TTCTGGTGGT AACCAACTCC CTCAACATTT 9720
GCTTAGCTGA AAATGATCTT ATTTCTCTGT TGCTTAGGAA GCTTAGTTTG GCTGGATATG 9780
AAATTCTTGG GTGGATATTT TTTAAGAATA TTGAATATAG GCCCCAATAT CTTCTAGCTT 9840
GTACGGGTTC AGTTGAGAGG TATGCTGTTA GATTGATGGG GTTCCCTTTG TAGACGACCT 9900
GTCCTTTCTC TCTAGCTGCC TTTAACATTC TGTCTTTCAT TTTGACCTTG GAAAATCTGA 9960
TGATTATGTG TCTTGAGGAT GATCTTCTTG TATAGAATCT
CACAGGGGTT CTCTGTATTT TCTAAATTTG ACTATTGGCC TCTCTAGCAA GGTTGAAGAA 60
GTTTTCATGG ACAATATCCT GAAATGTTTT CTAAATTGTT TACTTTCTCC CCATCCCTTT 120
CAGAAATGCC AGTGATTTGT AGATTTGGCC TTTTTACATA ATCCCATGTT TCTTGGAGGC 180
TTTGTTCATT CCTTTTCATT CTTTTTTCTT AATTTTTGTC AACTGTCTTA TTTCAGAAAG 240
CCAGTCTTCC ATTTCTGAGA TTCTTTCCTC AGCTTGGTTT ATTTTGCTAT TAATACTTGG 300 
ATTGCTTTGT GAAATTCTTA CAGTTTGTTT CTCAGCTCTC AGCTCTGTCA GATCCATTAG 360 
GTTCTTTTTT AAACCAGTGA TTTTGTCTTT CAGCTTCTAT ATCATTTTAT TGTGATCCTC 420
AATTTCCTTG GATTGGATTT TGCCATCCTC CTGGATCTTG ATGATCTTCA TTCCTATCCA 480
TAGTCTGAAT TCCAGTTCTA TCATTTCAGC CAGCTCAGCC TTGTTAAGAA CCCTTGTTAG 540 
AGAACTAGTG TGGTTGTTTG GAGGACATAT GGCACTCCGG CCTTTATGTT CCTTTAACTG 600 
CAGTGTAGGT TGAATACAGC CAATAGACTT GOTCTTTGGA TGTTTTTACA GGGCCAAAGC 660 
CTTGTGCAGG GTCTTTATTT GTAGTTGATT TCTTGTCTTT GGTTTCATAG TGTGGTATGT 720 
TAGCAAGGTA TTTTTGGTGT TGAAGCTTTG GGGTGTGATC CATTTTTTAT TTGTATATTT 780 
CCCTACACCT AAAACAAGCA AAAAAACAGT AAAGGTCTTT GAGTCTCTTA ATCCATAATT 840 
TCAGCATTCC TGAGTATGCT TCCCTGGGTA AGTGGGGTTT TCACCCAGCC CTCAAGTTAA 900 
GAGTGTTAGA TTATTTTTCA TGTGAAATTA GCCAGACTGG CTTTCTTAAC ACAATGTAAA 960 
ACAATAACAA CAAAAGTTAT AATTAGACTA GTCTTCTTCC CAAATACCCA CATGTCTAAT 1020 
GTAAGTGGGA TGGTGTTAAA CAGGGGACCT ACAACTGGGG GAGAGGCGGA CAGGTCCCAT 1080 
GGCCCCAGGT CTAGGATGGC ATXTGGTATT GGTTGATGGG TGTGGATGTG AACAAGAGAG 1140
GGAACACTTG TGCAGGATAT GGTATCAGCA CCTGTAATAC ATTTTAGGGA TTCTTTCTTC 1200 
TCTTTGCAGT ATGCCCTGAC AATAATTATA TCCATCAGCC TAGTCCCCTT GGCCATTGAA 1260 
ACACTAAGAC TGTCTTAGGA TCCCTGCTGC AGTTTCTCAG AGGTGCTAGG AGGGCATTAG 1320 
GAGTCTGAAG CCCTGGAAGT GTGTTCTGAC TTTGCCACTA GCTAGATAGA CCTGGACTAG 1380 
GCACGTTACC TCTTTGTACC ACTCAGCTCT AACCCCTCAT TCAAAAACCC AGCATTTTCA 1440
AGTGGTGTTT TTCACATCAG CCTTTGCATA AGTTTTCATT TGAAGAAAGG TTTTTTTGTT 1500 
TTTGTTTTCT TGGTTTAATC AAACATTTAA AAACGAATGG TCTAGATGAT TTCAAAGTGG 1560 
CTTTCCTTTT CCTGTGCTTT TCCTACTATT TAAAAACTTC ACCTCCTTGA TTTCTTGATC 1620 
TCCCTTTCTG CACTGCTGGG TCTGGGAGCA TTGAGGCCAA GTAAAAGGAA CCTTGGCAAA 1680
GGAGGAACAC CTATGGGTGT GCCAGGCTGC TCCCAGTGTT TTGCATTTTT AAAAATTTAA 1740
ATGCTGCAAA CCTCTATGAA TTACATATTA TTGTTCCTAG TTTACAAATT AGGAGCCTGA 1800
GGCTCAGAGA ATGTGTGGGA TGGTACAGAC TAACCTGAAT TAGAACCCTG GCTCCCATTT 1860
ACTGGCTGTC AGGACTTAGA AAAGTCATAA ACTCTCTGGC TGGGTGCAGT GGCTCACGCC 1920
TGTAATCCCA GCACTTTGGG AGGCCGAGGC AGGCAGACCA CGAGGTCAGG AGCTTGAGAC 1980
GAGCCTGACC AACACGGTGA AACCCCGTCT CTACTAAAAA TACAAAAATT AGCCGGGTGT 2040
GGTAGCACAC CCCTGTAATC CCAGCTACTC AGGAGGCTGA GGCAGGAGAA TCGCTTCAAC 2100
CTGGGAGGTG GAGGTTGCAG TGAGCCAAGA TTGTGCCCAC TGCACTCCAG CCTGGGTGAC 2160
AGAGTGAGAC TCTATGTGAG AAAGAAAGAA AGAAGGAAAG AAGGAAAGAA GGAAGAAAAG 2220
AAAGAGAAAG AAAGAAAGAA AGAAAGAAAG AAANNNNNNN NNNNNGAAAG AAAGGGAAAG 2280
AAAGAGAACG AAAGAAAGAA GGGAGGGAGG GAGGGAGGGA GGGAGGGAGG GAGGGAGGAA 2340
GGGTGGGTGG GTTGTGAACT CTTGTTGATT GTTTCCTCAG CTGAAATGTG GGCTGCAGGG 2400
CTATTGGGGG AGAAACAATA AGAAAGTGCA CCAAGCACCA AGCACATGCT AAGAAGTCCA 2460
TCATGGCAGC TCCTGATAAT AATATGGAAT AGAGTTGTAT CTAACATGAC TCTTTCTTGC 2520
AAGTGACAGA AAATGCAACT TAAGTTGGAT TAAGCAAAAA AGAGAAATCA TTAGTGAACT 2580
GAAAATTCTG CAGGCTCACA TCATGGCCCC AGACCCTGTC CATTATTCTT GGGCACAAAT 2640
GTGACATTCT CGTGGCTGCA GATGCTGTGG TGGCTCTGGC TCTGCAGGAA AAGAAATAAG 2700
GAAGGCCACT CTCCCCATTA CACAAACAAC AGTCTTCCAG CTCTGAGAGG TCGAACTTGT 2760
GTCACCAGCC TGCCCCTAAA CCCGTCACTG ATTAACTCCA ACCTGCATCA GCTGTTCCAT 2820
GCTGGAGGTG GACGCAGGAC CACACTCATA CCAAGATGGG GGCAAAGTGT AGTTCCCTCA 2880
ACAGGATTAT AGGATATAGT GTGATAGGCT GCTGGGCCAG AAAAGCAAAC AGATCCTCTA 2940
CAATTCCTCA ACTGATGAAA GCACGAAGCT AAAATCATAA AGATCTGTGT GTGAGTTCTG 3000
GCTCTCCCAT CTTCCTTGTG AGATTGAGCA GTTAGTTAAT CTCTTTTAGC CTCAGCTTTC 3060
TCACCTGTAC CAACATATAA GGTCATTGTG AGGATTAAGA TTATGCCTCA TGATCATCAT 3120


TATCATCATC 


ACCATCCACA 


TTGCAACCAC 


AACTACCATC 


ATCATCCCCA 


CCAACATCAT 


3180 


CACCACCACC 


ACCATCACAA 


TTATCATTAC 


CACCACCACC 


ATTGTCACCC 


TCAACATCAC 


3240 


CATCATCACT 


ATCACCACCA 


CCATCATCAT 


CACTACCACT 


ACCAACACCA 


TCACTCTCAT 


3300 


CATTCCACCA 


CCATCACCAT 


TAACATTACC 


ATCACTATCA 


TCACCACCAC 


CACCACCACC 


3360 
ACCCCCATCA TTACTGCCAT CAACATCACC ATCACCATCA TCACCACCAT CACCATCATT 3420 
ATCAACCATC ATCACCACCA TTCCAGCACC ATCACCATTA TCATCACTAC CATTATTCCA 3480 
CCACCATCAT CATCCACCAC CACTACCACC ACCATCACCA CCATCATCAC CATAACCATC 3540 
ATCACCACTA TCAACATGAT AGTAATTATG ATTACCACCA CCATTAGCAT TATCATTACC 3600 
ACCACCAGTA CCATCACCAT CACCACCGCC ACCACCTCCA TGATCATTAC TACCCACCAC 3660 
CATCACCGTC ACCATCATTT CACTACCAGC ACAATTATCA TTACCACCAC CATCACTACC 3720 
ACCCTTATCA CAACCCTCAT CATCACCACC ATTCACCAGT GCACCACCAC CACCACCATC 3780 
ACTATCATTA ACAATAGACA TCACATAACC AGTTTGTAGC TGGACCTTGA GCCCAGAGCC 3840 
CACTCACTGT TTCTTCAGTC CCACCGCCAA CCACCAGGAT GAGTCACAAA ACATAACTCA 3900 
GGCCTGCTCC TCAATTTTCT ACATGTCAAT AATGACATTG AAGCAATGGG TGTTCTCTGC 3960 
TTCTCAGAGG GAAGTTGAAA TTCTCCTGCT CTTCCCTTCA TGTTTCCAGA TGTTCCCTGA 4020 
CTTGGATATT CCAAACGCAG AGTTTGGAGG TGTTGAGGCC AAGGGGTTTT TCCAGGTCAG 4080 
CCATCATCTG CAATCACTGA GCTGATCCTG CTGCTGGACT TTCCCTGTTG CCCTCTCCCC 4140 
AACGCCCATC GGGGAGGGCT TCAATCCTCA GGTCACCTGT GGCCTTTCTG CCCTCAGAGG 4200 
TGCCATCTCT ACATCTACCA CTGGAAGGCA GCACCTACTC ACAGATTGCA TCAATTTCCC 4260 
AGCAACTCAT GGTGGGTTTT CCCCCTTATC AGCGTGXTTG CCTTGCTCAG AGAGCAGATC 4320 
CCAGAGCAGT GACACCTAAC TTAATTTTCA GCAAAACATT TTGAGAAGGG TGCTCCCTCA 4380 
CACAACTACA CAGTCCAGGT GATGCACCCA CTGCCCAATG CTTGGTAGTC AAGAGGAGCT 4440 
TCCTCCCTGC AGCTCTGCCC AGATAGGGCT GAG 4473 
ATTGGGAGCT GCCTCGTGTT CTGCAGCCTC ACAGACAGGA GGTCCAGTGC CGCTGCTCTG 60 
TTCTGGAATA TCCTCCTGAA TGTGTTTTGG GTGCAGTTGC CGTTTCTTTC ATCTTTTTAA 120 
ACACAGGTAC TTTTGGAGCT GGCCTTCTCA AGGAAGCCCA GCTCCCTGTG ATTGAGAATA 180 
XVIII 
AAGTGTGCAA TCGCTATGAG TTTCTGAATG GAAGAGTCCA ATCCACCGAA CTCTGTGCTG 240 
GGCATTTGGC CGGAGGCACT GACAGTTGCC AGGTAAGCAA AGATCAAGAG ACCAAAGTTA 300 
GTCTTGTGCT CTCTTGTCTC AGTCTCAGCC CCTGAGACTT CATTCCCCAG GTGGCAAATT 360 
CAAGGATTTT CAACCGAAGA CCCCAGTCTA AGTGTTGTTT AGAAACTTCC TAGATCTGTC 420 
CCTGAATGCG TATTCAGATC ATCTAAGGGG ATGTCTTGGG GCTTGAGTTC CAAATCAGTA 480 
GCAAGCGAGT TTTAAGTGCC ATAACTACCT CAGGCCACTC ACCCTCCTGG GGTGTGCTGG 540 
TGGCCAGGGA CTAAAGTGGT GACTTTTCCG GTAGGGAAGG AGGTAGAGGG TACAGGACAG 600 
AGACCAACTG CACACACTTT ACACTGATGC CCAGGCTAGC CCAGTCTAAA GGAAACACCA 660 
ACATAGGAAG GGATGTGTGC AGGATTCACA AAAGATCTTT TCTACCCCCC GGAAAAACTA 720 
AGTGGTGTGG TTTCGCTAAA CAGATTTTGC TAAGTACTTA AGCACTGCAG ATGCTTGAGT 780 
AATATGCTCA TAAGTTCCTT TCTGATTTCA ATTACTGGGA AAATGTATAT ATGGATAGTA 840 
GAAGGATGGC ATCCCATAAT AAAAGGCAGG CAGCCTAACC CTCACATGCA TTTTTCTCTC 900 
CCTCTGTATA GGGTGACAGT GGAGGGCCTC TGGTTTGCTT CGAGAAGGAC AAATACATTT 960 
TACAAGGAGT CACTTCTTGG GGTCTTGGCT GTGCACGCCC CAATAAGCCT GGTGTCTATG 1020 
XIX 
TTCGTGTTTC AAGGTTTGTT ACTTGGATTG AGGGAGTGAT GAGAAATAAT TAATTGGACG 1080 
GGAGACAGAG TGACGCACTG ACTCACCTAG AGGCTGGGAC GTGGGTAGGG ATTTAGCATG 1140 
CTGGAAATAA CTGGCAGTAA TCAAACGAAG ACACTGTCCC CAGCTACCAG CTACGCCAAA 1200 
CCTCGGCATT TTTTGTGTTA TTTTCTGACT GCTGGATTCT GTAGTAAGGT GACATAGCTA 1260 
TGACATTTGT TAAAAATAAA CTCTGTACTT AACTTTGATT TGAGTAAATT TTGGTTTTGG 1320 
TCTTCAACAT TTTCATGCTC TTTGTTCACC CCACCAATTT TAAATGGGCA GATGGGGGGA 1380 
TTTAGCTGCT TTTGATAAGG AACAGCTGCA CAAAGGACTG AGCAGGCTGC AAGGTCACAG 1440 
AGGGGAGAGC CAAGAAGTTG TCCACGCATT TACCTCATCA GCTAACGAGG GCTTGACATG 1500 
CATTTTTACT GTCTTTATTC CTGACACTGA GATGAATGTT TTCAAAGCTG CAACATGCAT 1560 
GGGGAGTCAT GCGAACCGAT TCTGTTATTG GGAATGAAAT CTGTCACCGA CTGCTTGACT 1620 
TGAGCCCAGG GGACACAGAG CAGAGAGCTG TATATGATGG AGTGAACCGG TCCATGGATG 1680 
TGTAACACAA GACCAACTGA GAGTCTGAAT GTTATCCTGG GGCACACGTG AGTCTAGGAT 1740 
TGGTGCCAAG AGCATGTAAA TGAACAACAA GCAAATATTG AAGGTGGACC ACTTATTTCC 1800 
CATTGCTAAT TGCCTGCCCG GTTTTGAAAC AGTCTGCAGT ACACACGGTG ACAGGAGAAT 1860 
GACCTGTGGG AGAGATACAT GTTTAGAAGG AAGAGAAAGG ACAAAGGCAC ACGTTTTACC 1920 
ATTTAAAATA TTGTTACCAA ACAAAAATAT CCATTCAAAA TACAATTTAA CAATGCAACA 1980 
GTCATCTTAC AGCAGAGAAA TGCAGAGAAA AGCAAAACTG CAAGTGACTG TGAATAAAGG 2040 
GTGAATGTAG TCTCAAATCC TCAAAGAGCT GTGTTTATTT CATTGACAAA TAGATTATTT 2100 
GTATCAA 2107 
FIG. 4 

Lanes: amplified DNA; Fnu4HI digest (1 hr); Fnu4HI digest (4 hrs). 
Fragment sizes marked: 369 bp, 216 bp, 123 bp. 